NNPDF / pinecards

Runcards needed to generate PineAPPL grids for NNPDF processes

Add positivity support #132

Closed cschwan closed 2 years ago

cschwan commented 2 years ago

Merging this PR will close #124.

Please have a look at this. It's the first Python code other than matplotlib scripts that I've written in a while, so every suggestion is welcome. At this point it's sort of an MVP, but a few things still have to be done: scale variation, metadata, ?. The x-grid must be specified in the runcard; let me know if that's a good idea.

alecandido commented 2 years ago

Just a spare comment: I have no better proposal, but personally I find Q15 a bit confusing. It was not clear to me that it meant q = 1.5...

cschwan commented 2 years ago

Just a spare comment: I have no better proposal, but personally I find Q15 a bit confusing. It was not clear to me that it meant q = 1.5...

Agreed. If you have a better suggestion, I'd be happy to adopt it!

felixhekhorn commented 2 years ago

Just a spare comment: I have no better proposal, but personally I find Q15 a bit confusing. It was not clear to me that it meant q = 1.5...

Agreed. If you have a better suggestion, I'd be happy to adopt it!

Do we need this name tag? It was not present in the old name, and I wonder how quickly it will change (the xgrid isn't part of the name either and is at least equally important ...)

cschwan commented 2 years ago

@felixhekhorn good point. Do you know where the old positivity grids are and how they are named? Maybe we can add the NNPDF version, when they were first added, to the positivity dataset name. So something like NNPDF_POS_CHARM_40; then we know this is the positivity dataset used in 4.0. But I don't have a strong opinion about this, we should just have something more descriptive than NNPDF_POS_CHARM.

cschwan commented 2 years ago

Another point: does the q parameter in the runcard coincide with the fitting scale? Because the Grid that we generate happens to be also an FkTable (from PineAPPL's point of view, without evolution).

alecandido commented 2 years ago

Another point: does the q parameter in the runcard coincide with the fitting scale? Because the Grid that we generate happens to be also an FkTable (from PineAPPL's point of view, without evolution).

I believe it's not required: indeed, Tommaso is about to propose raising the scale at which positivity is imposed a bit. For sure you have to specify it explicitly, but you can assume that evolution always takes place (if it's trivial, then it is eko's responsibility to provide a trivial evolution operator).

felixhekhorn commented 2 years ago

Another point: does the q parameter in the runcard coincide with the fitting scale? Because the Grid that we generate happens to be also an FkTable (from PineAPPL's point of view, without evolution).

felixhekhorn commented 2 years ago

@felixhekhorn good point. Do you know where the old positivity grids are and how they are named? Maybe we can add the NNPDF version, when they were first added, to the positivity dataset name. So something like NNPDF_POS_CHARM_40; then we know this is the positivity dataset used in 4.0.

But I don't have a strong opinion about this, we should just have something more descriptive than NNPDF_POS_CHARM.

I see your point, but I've no good suggestion ...

cschwan commented 2 years ago

Another point: does the q parameter in the runcard coincide with the fitting scale? Because the Grid that we generate happens to be also an FkTable (from PineAPPL's point of view, without evolution).

* No, because the 4.0 fitting scale is 1.65 GeV

* Moreover, this scale would be a real problem in the old setup, as it is below the charm threshold (mc = 1.51 GeV)

We could add some metadata that specifies the 'positivity scale'. The fit can then check for potential pitfalls.
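A minimal sketch of what such a check could look like, assuming the positivity scale is available as a plain number in GeV. The function name and the warning texts are hypothetical; the numbers are the ones quoted in this thread (q = 1.5, fitting scale 1.65 GeV, mc = 1.51 GeV).

```python
def positivity_scale_pitfalls(q_pos, q_fit, m_charm):
    """Return human-readable warnings for a positivity scale q_pos (in GeV).

    Hypothetical helper: the two checks mirror the concerns raised in
    this thread, not an existing validphys or pineko function.
    """
    warnings = []
    if q_pos < m_charm:
        warnings.append(
            f"positivity scale {q_pos} GeV is below the charm threshold {m_charm} GeV"
        )
    if q_pos != q_fit:
        warnings.append(
            f"positivity scale {q_pos} GeV differs from the fitting scale {q_fit} GeV"
        )
    return warnings


# Values quoted in the discussion above.
found = positivity_scale_pitfalls(q_pos=1.5, q_fit=1.65, m_charm=1.51)
for message in found:
    print(message)
```

With these inputs both warnings fire; a positivity scale equal to the fitting scale and above the threshold returns an empty list.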

scarlehoff commented 2 years ago

MVP

Most Valuable Positivity

But I don't have a strong opinion about this, we should just have something more descriptive than NNPDF_POS_CHARM.

Why is NNPDF_POS_CHARM not descriptive enough?

I would not add the value of Q to the name, because for different theories we may want to use a different Q while conceptually the positivity is the same. I would like to be able to compare NNPDF_POS_CHARM for fits A and B without having to look for scales or anything.

We could add some metadata that specifies the 'positivity scale'. The fit can then check for potential pitfalls.

exactly

alecandido commented 2 years ago

We could add some metadata that specifies the 'positivity scale'. The fit can then check for potential pitfalls.

Actually, at grid level this information is present (because it is the scale contained in the subgrid). We just need to make sure the information is propagated down to FkTables, where it would be flushed by evolution.

felixhekhorn commented 2 years ago

We could add some metadata that specifies the 'positivity scale'. The fit can then check for potential pitfalls.

Actually, at grid level this information is present (because it is the scale contained in the subgrid). We just need to make sure the information is propagated down to FkTables, where it would be flushed by evolution.

we can just add this here: https://github.com/NNPDF/runcards/blob/ecd40ef995b3f2f06257a448a4f5ec7fafe21539/runcardsrunner/external/positivity.py#L73

alecandido commented 2 years ago

we can just add this here:

https://github.com/NNPDF/runcards/blob/ecd40ef995b3f2f06257a448a4f5ec7fafe21539/runcardsrunner/external/positivity.py#L73

For ease of parsing, I would add a separate key with a single float. Everything in PineAPPL metadata is string based, so in that case, to achieve a simple enough result, you should turn that value into a JSON object -> better a separate key.
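To illustrate the trade-off, here is a small sketch using a plain Python dictionary as a stand-in for string-valued metadata. The key names `positivity_scale` and `runcard`, and the runcard fields `q` and `xgrid`, are assumptions made for this example; no PineAPPL call is used.

```python
import json

# Stand-in for string-based metadata: both keys and values are strings.
metadata = {}

# Option 1: a dedicated key holding a single float, serialized as a string.
metadata["positivity_scale"] = repr(1.5)
q_from_key = float(metadata["positivity_scale"])

# Option 2: embed the value in a larger JSON object stored under one key.
metadata["runcard"] = json.dumps({"q": 1.5, "xgrid": [0.1, 0.5, 0.9]})
q_from_json = json.loads(metadata["runcard"])["q"]

print(q_from_key, q_from_json)
```

The dedicated key needs only a `float(...)` call on the reader's side, whereas the JSON variant requires the reader to know the structure of the embedded object.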

cschwan commented 2 years ago

We could add some metadata that specifies the 'positivity scale'. The fit can then check for potential pitfalls.

Actually, at grid level this information is present (because it is the scale contained in the subgrid). We just need to make sure the information is propagated down to FkTables, where it would be flushed by evolution.

This scale is implicitly contained in Grid (the scale of the grid is the positivity scale), but the evolution changes the scale to the fitting scale and then the positivity scale is lost, so we'd need to put this into metadata.

cschwan commented 2 years ago

... on the other hand it will be contained in the runcard metadata which is also yaml. This is still missing though.

cschwan commented 2 years ago

Commit 2b9de22aa3db7cf4c413fb4d41f82dbdabdaf5bb writes the complete runcard as a JSON string into the metadata, just like yadism does; the metadata therefore automatically contains the positivity scale q.

cschwan commented 2 years ago

What's finally left is to implement all the positivity datasets themselves, so

How can I find out what the other datasets are? What are the x grid values? What's the q value?

alecandido commented 2 years ago

All POSX... are just PDF flavors, those are as simple as the one you implemented.

All POS2F... are actually DIS structure functions. Exactly which ones, God only knows (or whoever implemented them last time)...

About DY, I know nothing...

About the kinematics (for all of them): I had a look in the 4.0 paper for xgrid and Q2 value, but I found nothing. Maybe we should ask in Amsterdam (I'll do it, if you remind me... email is fine, or I can try to put a memo on the phone...), otherwise we can just send an email to the mailing list.

P.S.: you can try to have a look at Tommaso's PhD thesis, he implemented positivity in 4.0. Otherwise we can invoke @scarlehoff help...

enocera commented 2 years ago

Concerning positivity constraints on DIS structure functions and DY cross sections, I suggest that you have a look at Sect. 3.2.3 in https://arxiv.org/pdf/1410.8849.pdf, and in particular at Eqs.(14). As for the kinematics, this is defined in https://github.com/NNPDF/nnpdf/blob/master/buildmaster/filters/POS.cc, for the number of data points in the POS* files here https://github.com/NNPDF/nnpdf/tree/master/buildmaster/meta

enocera commented 2 years ago

So, for instance, POSF2U is the contribution from u and ubar PDFs to the DIS structure function F2; POSFLL is the contribution from all light quark and gluon PDFs to the DIS structure function FL; POSDYU is the contribution of the u*ubar PDFs to the DY differential cross-section.

enocera commented 2 years ago

And please note that in positivity constraints like POSXUQ you really need to consider x*PDF, not just the PDF.

cschwan commented 2 years ago

And please note that in positivity constraints like POSXUQ you really need to consider x*PDF, not just the PDF.

Thanks @enocera, this might be wrong in the current implementation. But that'll be the first thing that I'll try when comparing old against new tables.

cschwan commented 2 years ago

Concerning positivity constraints on DIS structure functions and DY cross sections, I suggest that you have a look at Sect. 3.2.3 in https://arxiv.org/pdf/1410.8849.pdf, and in particular at Eqs.(14). As for the kinematics, this is defined in https://github.com/NNPDF/nnpdf/blob/master/buildmaster/filters/POS.cc, for the number of data points in the POS* files here https://github.com/NNPDF/nnpdf/tree/master/buildmaster/meta

Perfect, I think that's all I need to know!

enocera commented 2 years ago

Incidentally, the FK tables for the DY positivity constraints are generated in the fixed-target configuration using APFEL. Therefore I'm not sure how you're going to replace them.

cschwan commented 2 years ago

But I don't have a strong opinion about this, we should just have something more descriptive than NNPDF_POS_CHARM.

Why is NNPDF_POS_CHARM not descriptive enough?

We might want to change the x grid in a future fit. Right now, I can see that there are 20 x points which don't agree with our 50 default grid points that both Madgraph5_aMC@NLO and yadism choose. In that case the file positivity.yaml will be different and I'd consider that a different 'positivity observable', which would also be visible in validphys. Does that make sense to you, @scarlehoff?
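As an observation on those 20 points: the x values listed in the comparison table further down this thread look consistent with 10 logarithmically spaced points from 5e-7 to 0.1, followed by 10 linearly spaced points up to 0.9. The sketch below reproduces them under that assumption; the spacing rule is inferred from the printed numbers, not taken from POS.cc.

```python
# Endpoints read off the comparison table in this thread; the split into a
# logarithmic and a linear part is an inference, not a documented rule.
X_MIN, X_MID, X_MAX = 5e-7, 0.1, 0.9
N_LOG, N_LIN = 10, 10

# Logarithmic part: X_MIN ... X_MID, both endpoints included.
log_part = [X_MIN * (X_MID / X_MIN) ** (k / (N_LOG - 1)) for k in range(N_LOG)]

# Linear part: continues from X_MID (excluded) up to X_MAX (included).
lin_part = [X_MID + (X_MAX - X_MID) * k / N_LIN for k in range(1, N_LIN + 1)]

xgrid = log_part + lin_part
for x in xgrid:
    print(x)
```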

scarlehoff commented 2 years ago

I see. I guess I would argue we want to have the same default grids as Madgraph and yadism and set that as the NNPDF_POS_CHARM and have some transitional names in the middle if we want.

It is different from the current positivity observable, but it will always be the same from now on.

cschwan commented 2 years ago

Incidentally, the FK tables for the DY positivity constraints are generated in the fixed-target configuration using APFEL. Therefore I'm not sure how you're going to replace them.

@enocera ~where do you see this?~ It's written in the paper you linked above. I think that means we have a job for yadism @felixhekhorn @AleCandido!

enocera commented 2 years ago

@cschwan In the apfelcomb database

Incidentally, the FK tables for the DY positivity constraints are generated in the fixed-target configuration using APFEL. Therefore I'm not sure how you're going to replace them.

@enocera where do you see this?

In the apfelcomb database e.g. https://github.com/NNPDF/apfelcomb/blob/f49871435792bff1e269d524c3e6861aa6dd3e28/db/apfelcomb.dat#L630

enocera commented 2 years ago

We might want to change the x grid in a future fit. Right now, I can see that there are 20 x points which don't agree with our 50 default grid points that both Madgraph5_aMC@NLO and yadism choose. In that case the file positivity.yaml will be different and I'd consider that a different 'positivity observable', which would also be visible in validphys. Does that make sense to you, @scarlehoff?

While I understand that we may want to change the x grid in a future fit, I'd like to clarify that, for the DY positivity observable we have 20 x points (where x is the momentum fraction in the PDF - and the set of x values are those on which the positivity constraint is checked) but then we have 40 x interpolation points in the FK table, see https://github.com/NNPDF/apfelcomb/blob/f49871435792bff1e269d524c3e6861aa6dd3e28/db/apfelcomb.dat#L802. Which x are you referring to @cschwan, if I may ask?

cschwan commented 2 years ago

We might want to change the x grid in a future fit. Right now, I can see that there are 20 x points which don't agree with our 50 default grid points that both Madgraph5_aMC@NLO and yadism choose. In that case the file positivity.yaml will be different and I'd consider that a different 'positivity observable', which would also be visible in validphys. Does that make sense to you, @scarlehoff?

While I understand that we may want to change the x grid in a future fit, I'd like to clarify that, for the DY positivity observable we have 20 x points (where x is the momentum fraction in the PDF - and the set of x values are those on which the positivity constraint is checked) but then we have 40 x interpolation points in the FK table, see https://github.com/NNPDF/apfelcomb/blob/f49871435792bff1e269d524c3e6861aa6dd3e28/db/apfelcomb.dat#L802. Which x are you referring to @cschwan, if I may ask?

What I meant was the x grid values of the interpolation grid, because it is beneficial for the speed of the fit for them to be the same across all/most datasets (@scarlehoff correct me if I'm wrong). For some positivity datasets we simply have xf(x), where x coincides with the momentum fraction, of course; for DY positivity observables that might be slightly different, but I'm not interested in that for this exercise. Apart from the previously mentioned point we simply have to reproduce the old grids!

cschwan commented 2 years ago

@enocera where can I get the positivity grids for each of the flavours? Do we have them in a form before evolution?

alecandido commented 2 years ago

However, consider that for the 4.0 replacement we want to enforce positivity of the whole PDFs, but later we might want to impose positivity only at large x. (This is simple: we set the entries we're not interested in to 0 instead of a Kronecker delta.)

For POSF... this can be implemented as a yadism observable.
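A toy sketch of the Kronecker-delta idea, assuming the positivity observable at point i is just x*f(x) evaluated at grid node i, so that the kernel is an identity matrix. The function names, the `x_min` cut and the sample values are made up for illustration.

```python
def positivity_kernel(xgrid, x_min=0.0):
    """Kronecker-delta kernel; rows for points with x < x_min are zeroed."""
    n = len(xgrid)
    return [
        [1.0 if (i == j and xgrid[i] >= x_min) else 0.0 for j in range(n)]
        for i in range(n)
    ]


def apply_kernel(kernel, values):
    """Contract the kernel with the values of x*f(x) on the grid nodes."""
    return [sum(k * v for k, v in zip(row, values)) for row in kernel]


xgrid = [0.01, 0.1, 0.5, 0.9]
xf = [-0.2, 0.3, 0.1, 0.01]  # made-up values of x*f(x) on the grid

# Whole-PDF positivity: every point contributes.
full = apply_kernel(positivity_kernel(xgrid), xf)
# Large-x positivity only: points below x_min are switched off.
large_x = apply_kernel(positivity_kernel(xgrid, x_min=0.4), xf)

print(full)
print(large_x)
```

In the second case the negative value at x = 0.01 no longer enters the observable, so it cannot be penalized.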

About FTDY of course we can do nothing, and as for all the other FTDY observables, we'll keep using old FkTables, up to FTDY provider replacement.

cschwan commented 2 years ago

By the way, how does positivity work when scales are varied? Are they also varied in the positivity grids? Would that make sense?

felixhekhorn commented 2 years ago

Just for the record: POSF2DW is down-like F2: https://github.com/NNPDF/nnpdf/blob/001550dde517f7125a67ed60431c42a51403ba49/buildmaster/meta/POSF2DW.yaml (for whatever reason)

alecandido commented 2 years ago

By the way, how does positivity work when scales are varied? Are they also varied in the positivity grids? Would that make sense?

NNPDF never fits scale varied theories, so they are only used to construct the theory covmat (and that one has no contribution from pseudo-observables, I guess).

enocera commented 2 years ago

@enocera where can I get the positivity grids for each of the flavours? Do we have them in a form before evolution?

Do you mean FK tables? These are available in each theory along with all other FK tables.

alecandido commented 2 years ago

Do you mean FK tables? These are available in each theory along with all other FK tables.

I believe @cschwan really meant grids in the sense of PineAPPL grids (or in this case APPLgrids).

The problem is that they were all generated with APFEL, meaning that they were never dumped in the middle, evolution was applied online. There has never existed anything like a DIS grid before PineAPPL (or FTDY grid, but that is still lacking). As Valerio pointed out once, APFEL is actually computing some kind of grids (otherwise it would be impossible to get FkTables), but they are always kept in memory.

cschwan commented 2 years ago

Note to myself: the theories are stored on the CERN server (look at the NNPDF wiki, 'storage servers'), at /eos/user/n/nnpdf/www/tables/.

alecandido commented 2 years ago

Actually, I'd say that vp downloads its theories from here: https://nnpdf.web.cern.ch/nnpdf/tables/

Is it the same?

scarlehoff commented 2 years ago

@cschwan @AleCandido @felixhekhorn I would like to send a fit with pineko-positivities tomorrow so that it is ready for Amsterdam. Are they ready? (if not I'll do a fit with the old positivities)

alecandido commented 2 years ago

@cschwan @AleCandido @felixhekhorn I would like to send a fit with pineko-positivities tomorrow so that it is ready for Amsterdam. Are they ready? (if not I'll do a fit with the old positivities)

If you want to have PDF positivity, you just need to plug in the xgrid and the scale yourself, but the runner is ready.

For everything else, the answer is just no.

P.S.: consider that half of the observables used are FTDY, so the ability to generate completely new positivity FkTables strictly depends on the implementation of an FTDY provider

scarlehoff commented 2 years ago

Ok, then I'll do a fit with the old positiv

cschwan commented 2 years ago

@scarlehoff I also have to test this, which means comparing the old FK tables with the grids generated using this generator, but for that I need https://github.com/N3PDF/pineappl/issues/70, and that'll take a while ...

cschwan commented 2 years ago

Using the FK table importer from https://github.com/N3PDF/pineappl/issues/70 it seems that the generator developed in this branch works:

b  x1                         x2                                        diff                
--+-+-+------------------------+------------------------+------------+------------+---------
 0 5 5                0.0000005                0.0000005  2.4475749e0  2.4477671e0 -7.852e-5
 1 5 5 0.0000019407667236782136 0.0000019407667236782136  3.2610530e0  3.2611163e0 -1.943e-5
 2 5 5  0.000007533150951473337  0.000007533150951473337  3.7773898e0  3.7775120e0 -3.235e-5
 3 5 5  0.000029240177382128657  0.000029240177382128657  4.0546412e0  4.0546210e0  4.982e-6
 4 5 5   0.00011349672651536727   0.00011349672651536727  4.1613019e0  4.1613313e0 -7.072e-6
 5 5 5   0.00044054134013486355   0.00044054134013486355  4.1963326e0  4.1963081e0  5.836e-6
 6 5 5    0.0017099759466766963    0.0017099759466766963  4.2374306e0  4.2374955e0 -1.533e-5
 7 5 5     0.006637328831200572     0.006637328831200572  4.0281254e0  4.0281433e0 -4.437e-6
 8 5 5      0.02576301385940815      0.02576301385940815  3.0246906e0  3.0245962e0  3.123e-5
 9 5 5                      0.1                      0.1  1.3753229e0  1.3753758e0 -3.844e-5
10 5 5                     0.18                     0.18 6.3178274e-1 6.3174437e-1  6.073e-5
11 5 5                     0.26                     0.26 3.3070458e-1 3.3069396e-1  3.211e-5
12 5 5      0.33999999999999997      0.33999999999999997 2.0033492e-1 2.0034140e-1 -3.233e-5
13 5 5      0.42000000000000004      0.42000000000000004 1.2173626e-1 1.2174600e-1 -8.004e-5
14 5 5                      0.5                      0.5 6.3538331e-2 6.3534689e-2  5.732e-5
15 5 5                     0.58                     0.58 2.6317776e-2 2.6314609e-2  1.204e-4
16 5 5                     0.66                     0.66 8.3569885e-3 8.3565226e-3  5.575e-5
17 5 5                     0.74                     0.74 1.8948271e-3 1.8952207e-3 -2.077e-4
18 5 5                     0.82                     0.82 2.7347304e-4 2.7423993e-4 -2.796e-3
19 5 5                      0.9                      0.9 1.9429825e-5 1.9786850e-5 -1.804e-2

The first column in diff is the value of the PineAPPL grid, the second the converted FK table FK_POSXGL.dat and the third column the relative difference, using NNPDF40_nnlo_as_01180. The agreement gets worse in the large x region which we'd probably expect.
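For reference, the printed third column is consistent with the convention "first value divided by second value, minus one". The sketch below recomputes it for three rows copied from the table; the convention is inferred from the numbers themselves, and the function name is hypothetical.

```python
def relative_difference(grid_value, fk_value):
    """Relative difference of the grid result with respect to the FK table."""
    return grid_value / fk_value - 1.0


# (x, PineAPPL grid value, converted FK table value), copied from the table.
rows = [
    (5e-7, 2.4475749e0, 2.4477671e0),
    (0.18, 6.3178274e-1, 6.3174437e-1),
    (0.9, 1.9429825e-5, 1.9786850e-5),
]

rel = [relative_difference(grid, fk) for _, grid, fk in rows]
for (x, _, _), r in zip(rows, rel):
    print(f"x = {x:g}: {r:.3e}")
```

Because the table shows the two values rounded to eight significant digits, the recomputed numbers can differ from the printed third column in the last digit.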

cschwan commented 2 years ago

Here the remaining comparisons:

cschwan commented 2 years ago

Commit 7d0bf2f065dd64e4e66118ea57eaa8542bccc809 adds all positivity runcards that I can easily generate.

The only possible issue left (@scarlehoff?) is that the x values at which the positivity of x * f(x) is enforced, which are specified in positivity.yaml, are identical to the x grids of the PineAPPL grids and differ from the 50 x grid points we typically use. For convenience of the fit pineko should probably