Check predictions near threshold

felixhekhorn commented 1 year ago

Follow up of https://github.com/NNPDF/fktables/issues/19

When the final scale is close to a threshold (typically the bottom threshold), we see some discrepancy with the LHAPDF grids.

It was conjectured that this might be related to the LHAPDF interpolation. We should check explicitly by comparing with apfel (which we should match exactly) and with LHAPDF using the same settings.

giacomomagni commented 1 year ago

@felixhekhorn, @andreab1997 do we still have this problem?

scarlehoff commented 1 year ago

https://github.com/NNPDF/fktables/issues/19#issuecomment-1048714053 - we should check explicitly by comparing with apfel (to which we should match exactly) and LHAPDF with the same settings.

I don't think this is due to the LHAPDF interpolation because the difference is seen even when not using LHAPDF but just inspecting the files.

felixhekhorn commented 1 year ago

On a first naive attempt I can not reproduce this ... setup:

       hash                            q2s  theory   ocard                    pdf external       ctime
uid                                                                                                   
52   a9b7ef  {24.1081, 24.304899999999996}  a846b3  a57935  NNPDF40_nnlo_as_01180    apfel  11 seconds

with g(t,"a84")["mb"] -> 4.92 and {"mugrid": [4.91, 4.93]} I get for the gluon

               x       Q2           eko     eko_error         apfel  percent_error
0   1.000000e-07  24.1081  4.263117e+01  4.444656e-05  4.267273e+01      -0.097399
1   1.610262e-07  24.1081  4.091728e+01  5.398982e-05  4.095536e+01      -0.092999
2   2.592944e-07  24.1081  3.916259e+01  4.878524e-05  3.919733e+01      -0.088626
...
42  7.157895e-01  24.1081  1.228587e-03  2.405930e-09  1.228232e-03       0.028904
43  7.631579e-01  24.1081  4.300323e-04  7.136010e-10  4.310587e-04      -0.238104
44  8.105263e-01  24.1081  1.264622e-04  1.660539e-10  1.275819e-04      -0.877621
45  8.578947e-01  24.1081  2.927450e-05  7.045419e-11  2.997198e-05      -2.327116
46  9.052632e-01  24.1081  4.614323e-06  4.114902e-12  4.907490e-06      -5.973878
47  9.526316e-01  24.1081  7.852199e-08  2.155516e-11  3.979751e-07     -80.269622
48  1.000000e+00  24.1081 -3.308722e-24  0.000000e+00  0.000000e+00           -inf
49  1.000000e-07  24.3049  4.483622e+01  5.354970e-05  4.487891e+01      -0.095128
50  1.610262e-07  24.3049  4.293511e+01  7.699355e-05  4.297590e+01      -0.094913
51  2.592944e-07  24.3049  4.100491e+01  6.178382e-05  4.104197e+01      -0.090296
...
91  7.157895e-01  24.3049  1.208708e-03  2.496304e-09  1.208382e-03       0.026961
92  7.631579e-01  24.3049  4.228880e-04  7.379125e-10  4.239238e-04      -0.244346
93  8.105263e-01  24.3049  1.243156e-04  1.731014e-10  1.254347e-04      -0.892225
94  8.578947e-01  24.3049  2.876741e-05  7.184019e-11  2.946004e-05      -2.351060
95  9.052632e-01  24.3049  4.528945e-06  4.792545e-12  4.821592e-06      -6.069511
96  9.526316e-01  24.3049  1.919701e-08  2.381066e-11  3.908008e-07     -95.087775
97  1.000000e+00  24.3049 -3.308722e-24  0.000000e+00  0.000000e+00           -inf

where the last points in each $Q^2$ slice are beyond our absolute error, which we're willing to accept
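As a minimal sketch of how the `percent_error` column appears to be built: the formula below, the signed difference relative to APFEL in percent, is an assumption that reproduces the tabulated values, and it is not taken from the comparison code itself. The three rows are copied from the table above.

```python
import numpy as np

# Three gluon rows copied from the table above (Q2 = 24.1081):
# a small-x point, a large-x point, and the x = 1 endpoint.
eko = np.array([4.263117e+01, 7.852199e-08, -3.308722e-24])
apfel = np.array([4.267273e+01, 3.979751e-07, 0.0])

# Assumed definition: signed difference relative to APFEL, in percent.
# errstate silences the expected divide-by-zero warning at x = 1.
with np.errstate(divide="ignore", invalid="ignore"):
    percent_error = (eko - apfel) / apfel * 100.0

# Absolute differences, for comparison with the relative ones.
abs_diff = np.abs(eko - apfel)

print(percent_error)
print(abs_diff)
```

Under this definition the `-inf` at $x=1$ comes from dividing a tiny non-zero EKO value by an exact zero from APFEL, and the large relative error at $x \approx 0.95$ corresponds to an absolute difference below `1e-6`.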

scarlehoff commented 1 year ago

You have many points that are above the per-mille error (which is my threshold) so this might be just a difference on what we consider an acceptable error.

For instance: 43 7.631579e-01 24.1081 4.300323e-04 7.136010e-10 4.310587e-04 -0.238104 this I'd flag as an error in my comparison script, and you have many that are close enough to the threshold (0.092999) that could conceivably show up due to small perturbations.

felixhekhorn commented 1 year ago

If I correctly read the corresponding APFEL function CachePDFsAPFEL, then I believe APFEL is using the following strategy:

determine the number of Q2 points per subgrid here - the Q2g is computed only for this
compute the actual Q2 points inside each subgrid here
the Q2 grid is spanning between $(1+\epsilon) Q_{min}$ (here) and $(1-\epsilon) Q_{max}$ (here) with $\epsilon = 10^{-14}$ (here)
then what is happening at exactly the mass, I don't know ... most likely APFEL is interpolating (as everywhere else) - though right above it should be fine ...
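The strategy above can be sketched as follows. This is not APFEL's code: `nudged_bounds` and `subgrid` are hypothetical names, the logarithmic spacing and the number of points are assumptions, and the lower edge of 1.51 GeV is an illustrative value only. Only $\epsilon = 10^{-14}$ and $m_b = 4.92$ GeV come from this thread.

```python
import numpy as np

EPS = 1e-14  # value of epsilon quoted above


def nudged_bounds(q_min, q_max, eps=EPS):
    """Pull both ends of [q_min, q_max] strictly inside the interval."""
    return (1.0 + eps) * q_min, (1.0 - eps) * q_max


def subgrid(q_min, q_max, n_points, eps=EPS):
    """Grid points between the nudged bounds (log spacing is an assumption)."""
    lo, hi = nudged_bounds(q_min, q_max, eps)
    return np.geomspace(lo, hi, n_points)


# Hypothetical subgrid ending at the bottom threshold mb = 4.92 GeV.
m_b = 4.92
lo, hi = nudged_bounds(1.51, m_b)
grid = subgrid(1.51, m_b, 10)

print(hi < m_b, grid[-1] < m_b)
```

In this sketch the threshold itself is never a grid node: the last point sits just below $m_b$, so a value requested at exactly $Q = m_b$ would have to come from somewhere else, for instance interpolation.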

alecandido commented 1 year ago

You have many points that are above the per-mille error (which is my threshold) so this might be just a difference on what we consider an acceptable error.

For instance: 43 7.631579e-01 24.1081 4.300323e-04 7.136010e-10 4.310587e-04 -0.238104 this I'd flag as an error in my comparison script, and you have many that are close enough to the threshold (0.092999) that could conceivably show up due to small perturbations.

This is exactly the reason why we are not performing automated comparison:

you can see that there's a constant trend: relative errors deteriorate at large x, mainly because the absolute value is going down, so the absolute errors are small anyhow
in the large-x region things are largely interpolation dependent, and there might be significant discrepancies anyhow
for non-positive distributions there are sign-crossing regions, which spoil the comparison anyhow

You should at least include both relative and absolute thresholds in your comparison, and even that might not be enough to cover all the known but irrelevant quirks...
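A minimal sketch of such a combined check, using `numpy.isclose` on four gluon rows copied from the `make_grid(30, 20)` table earlier in this thread. The per-mille `RTOL` is the threshold quoted above. The `ATOL` value is an illustrative assumption, not a recommendation.

```python
import numpy as np

# Gluon rows copied from the table in this thread (Q2 = 24.1081).
x = np.array([1.000000e-07, 7.631579e-01, 9.052632e-01, 9.526316e-01])
eko = np.array([4.263117e+01, 4.300323e-04, 4.614323e-06, 7.852199e-08])
apfel = np.array([4.267273e+01, 4.310587e-04, 4.907490e-06, 3.979751e-07])

RTOL = 1e-3  # per-mille relative threshold mentioned above
ATOL = 1e-5  # illustrative absolute threshold (an assumption)

# np.isclose accepts a point if |eko - apfel| <= atol + rtol * |apfel|.
rel_only = np.isclose(eko, apfel, rtol=RTOL, atol=0.0)
rel_and_abs = np.isclose(eko, apfel, rtol=RTOL, atol=ATOL)

for xi, r, ra in zip(x, rel_only, rel_and_abs):
    print(f"x={xi:.3e}  relative only: {r}  relative+absolute: {ra}")
```

With these rows the relative-only check flags the three large-x points, while the combined check accepts all four, because their absolute differences are far below `ATOL`. The small-x point passes both checks, but only narrowly, at roughly 0.097% against a 0.1% threshold.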

scarlehoff commented 1 year ago

You have problems also with bigger values:

51  2.592944e-07  24.3049  4.100491e+01  6.178382e-05  4.104197e+01      -0.090296

As I said, if 0.09 is acceptable for you then this is just a difference on what we are considering acceptable (which might be fine). As I said before, indeed when I cut out the xmin the agreement is much better but is still not perfect.

In any case what worries me is that Q=mb is singled out.

alecandido commented 1 year ago

As I said, if 0.09 is acceptable for you then this is just a difference on what we are considering acceptable (which might be fine). As I said before, indeed when I cut out the xmin the agreement is much better but is still not perfect.

There are cases in which we could considerably reduce the discrepancy, but the different integration and interpolation procedures set some limits.

Usually, we consider we have a problem if we have a consistently shifted behavior, while single points are not that relevant (that's why we struggled implementing this criterion in a piece of code - we tried, but it got too complex, especially in 2D, where we could have 1D defects). Unless there is something that makes the point special.

In any case what worries me is that Q=mb is singled out.

Ok, this is worrying. But there are two different scenarios:

the problem is around $Q= m_b$, then there is something weird happening, and should be understood
the problem is at $Q=m_b$, then something is just messed up somewhere, like for apply_pdf, but this should be a minor problem (in $n_f=4$, while in $n_f=5$ it could be a serious problem with the matching)

scarlehoff commented 1 year ago

At nf=4 (i.e., below threshold): Only the point Q=4.92 GeV generates problems. And, if I put the rtol of the agreement at 1e-2 only a few points (and always for the gluon) show up in the check. They are all at Q=4.92 and they are all below 1e-7.

At nf=5, instead, many of the points at the threshold are wrong (so Q=4.92 GeV). As one gets farther away from the threshold fewer and fewer points are wrong... If I only look above 1e-7 then it's only Q=4.92 GeV. So everything wrong beyond the threshold also seems to be at very low x.

In order to avoid looking at very small numbers dancing around 0 I set up an absolute tolerance (quite big) of 1e-5 which further removes points from the check but not all.

Now, inspecting the rows one thing I'm noticing is that the PDF evolved with apfel has b(Q=mb) = 0.0 everywhere while eko doesn't and now I'm wondering whether the glitch where the value for f(Q=mb-) is being replaced with f(Q=mb+) is happening in apfel but in the opposite direction...

felixhekhorn commented 1 year ago

in the large-x region things are largely interpolation dependent,

let me stress this point: in the above comparison we were using the degree-4 interpolation and an xgrid of make_grid(30, 20) - so the 5 largest points (with the worst discrepancy) are on the edge of the interpolation

I repeated the exercise with lambertgrid(120) and I get again for the gluon (showing all points that are more than 1/1e3 off):

In [20]: gg[abs(gg["percent_error"]) > 0.1]
Out[20]: 
                x       Q2           eko     eko_error         apfel  percent_error
113  8.254176e-01  24.1081  8.245832e-05  2.000144e-10  8.255450e-05      -0.116500
114  8.540825e-01  24.1081  3.345233e-05  5.991384e-11  3.354433e-05      -0.274285
115  8.829312e-01  24.1081  1.182693e-05  2.054497e-11  1.189351e-05      -0.559871
116  9.119551e-01  24.1081  3.388340e-06  4.630859e-12  3.425517e-06      -1.085310
117  9.411462e-01  24.1081  6.585714e-07  1.849335e-12  6.858106e-07      -3.971819
118  9.704968e-01  24.1081  3.708396e-08  2.949383e-12  6.528939e-08     -43.200627
119  1.000000e+00  24.1081 -3.308722e-24  0.000000e+00  0.000000e+00           -inf
120  1.000000e-07  24.3049  4.483168e+01  4.851796e-03  4.487988e+01      -0.107388
121  1.194184e-07  24.3049  4.411498e+01  1.016400e-02  4.417525e+01      -0.136427
122  1.426075e-07  24.3049  4.341282e+01  8.298050e-03  4.346536e+01      -0.120872
123  1.702995e-07  24.3049  4.270099e+01  1.182497e-02  4.275080e+01      -0.116508
124  2.033689e-07  24.3049  4.198471e+01  1.176604e-02  4.203211e+01      -0.112765
125  2.428598e-07  24.3049  4.126487e+01  6.592086e-03  4.130986e+01      -0.108892
126  2.900192e-07  24.3049  4.054194e+01  3.455001e-03  4.058457e+01      -0.105025
127  3.463362e-07  24.3049  3.981627e+01  2.136799e-03  3.985677e+01      -0.101622
233  8.254176e-01  24.3049  8.105428e-05  2.291362e-10  8.115293e-05      -0.121559
234  8.540825e-01  24.3049  3.287482e-05  6.992960e-11  3.296829e-05      -0.283528
235  8.829312e-01  24.3049  1.161805e-05  2.343353e-11  1.168529e-05      -0.575396
236  9.119551e-01  24.3049  3.325930e-06  5.427210e-12  3.363299e-06      -1.111080
237  9.411462e-01  24.3049  6.418224e-07  2.861387e-12  6.726322e-07      -4.580478
238  9.704968e-01  24.3049  1.050610e-08  4.954262e-12  6.397652e-08     -83.578193
239  1.000000e+00  24.3049 -3.308722e-24  0.000000e+00  0.000000e+00           -inf

which I consider an improvement ... moreover, remember that I run APFEL with Fast evolution enabled, i.e. APFEL is still interpolating on x (we are not - output points are interpolation points)

In any case what worries me is that Q=mb is singled out.

just as a reminder: EKO can deal with the singularity (because we can compute in either FNS), but APFEL can't

scarlehoff commented 1 year ago

Inspecting the numbers it seems that exactly at Q=mb the evolution scripts with apfel and eko are just doing something different... the discrepancy with the LHAPDF grids might actually be due to that.

felixhekhorn commented 1 year ago

moreover, remember that I run APFEL with Fast evolution enabled, i.e. APFEL is still interpolating on x (we are not - output points are interpolation points)

actually, I'm no longer sure about this, because use_external_grid=True in banana

scarlehoff commented 1 year ago

I think this can be closed. The difference seems to be due to a difference in convention at exactly the quark thresholds.

Previously the first point at nf was "forced" to be the same as the last point of nf-1. With eko instead both blocks are treated separately. This would also explain changes seen around threshold since one of the points potentially entering the interpolation was just different.
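A toy illustration of the two treatments described above. All numbers are made up and not taken from any PDF set, and `threshold_point` is a hypothetical helper, not a function of eko, APFEL or LHAPDF.

```python
import numpy as np

# Toy values only: b-quark values at three x points, at Q = mb.
b_last_nf4 = np.zeros(3)                    # last point of the nf=4 block
b_first_nf5 = np.array([1e-3, 5e-4, 1e-4])  # hypothetical first point of the nf=5 block


def threshold_point(last_lower, first_upper, forced):
    """Return the first point of the nf block at the threshold.

    forced=True copies the last point of the nf-1 block (the previous
    behaviour as described above); forced=False keeps the separately
    computed nf value.
    """
    return last_lower.copy() if forced else first_upper.copy()


forced = threshold_point(b_last_nf4, b_first_nf5, forced=True)
separate = threshold_point(b_last_nf4, b_first_nf5, forced=False)

print(forced)
print(separate)
```

If the two blocks differ at the threshold, any interpolation that uses the threshold point as a node sees different inputs under the two treatments, which is the effect suggested above for points near $Q = m_b$.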

alecandido commented 1 year ago

I argue that is not a convention, and even APFEL is able to apply the matching properly.

The only problem might be retrieving the correct result and treating it separately; that was also a problem with EKO before #242 (indeed we had custom mechanisms for getting it correct in special cases, like the fitting scale or the $\alpha_s$ scale sitting on a threshold... now they are all obsolete, since they should consistently be evolution points).

felixhekhorn commented 1 year ago

I think this can be closed.

agreed

The only problem might be retrieving the correct result

see #283

since they should be consistently evolution points

see #265

NNPDF / eko

Check predictions near threshold #173