NNPDF / pinefarm

Generate PineAPPL grids from PineCards
https://pinefarm.readthedocs.io
GNU General Public License v3.0

Add conversion backend #3

Open alecandido opened 2 years ago

alecandido commented 2 years ago

Since we have different MCs available as back-ends (at the moment mg5 and yadism), we should add a conversion back-end powered by the pineappl conversion scripts.

Indeed, we are not able to produce all of the grids needed (and we won't be for quite some time), as some of them are the result of MC runs with MCs that are not publicly available. In these cases we are kindly given the runcards, so we should download them from somewhere else (or have the user running rr download them), and then convert to pineappl.

cschwan commented 2 years ago

What exactly should we convert? Can you give an example?

alecandido commented 2 years ago

I'm not so familiar with our APPLgrids/fastNLO grids, but for example (if I remember correctly) we should have grids from NNLOjet, for which we don't have the code, and we won't soon (if ever) have another code to produce them.

Even though, I guess most of the things are actually K-factors... (but we agreed to drop K-factors, didn't we?)

cschwan commented 2 years ago

Do you mean those: https://ploughshare.web.cern.ch/ploughshare/? For them we have appl2pine and fnlo2pine, of course.

alecandido commented 2 years ago

I was not even aware of the website, and I wonder if there are others, since I remember that at the time of 4.0 publication NNPDF contacted people in order to check if it was fine to make grids public (that to me meant some grids were not public yet, but given directly to NNPDF).

alecandido commented 2 years ago

In any case, these or others, since those grids are not PineAPPL grids and we need to convert them, I was thinking about making appl2pine and fnlo2pine part of the runcards runner, such that you have a uniform way (a single command) to generate them, even if they come from a conversion.

Maybe, it would be appropriate to warn explicitly in case of conversion, but this can always be done.

felixhekhorn commented 2 years ago

Maybe, it would be appropriate to warn explicitly in case of conversion, but this can always be done.

Why do we need a warning? It is just generated by some specific program, i.e. wget more or less.

felixhekhorn commented 2 years ago

@cschwan to give you a bit more context: yesterday we were brainstorming a bit about the new "theory layout" (theory in Emanuele's sense) and we figured it should be sufficient to have a list of PineAPPL grids together with a file which spells out how to combine the grids to match the experimental datasets - the specific format of that file is to be discussed in https://github.com/NNPDF/fktables/issues/12

alecandido commented 2 years ago

The specific program will be appl2pine (for example), and thus pineappl, and wget-like.

In any case, I thought that by default a computation from scratch is expected, since this should run (based on the runcards). If a conversion of a former run is happening behind the scenes, better to warn, or not?

felixhekhorn commented 2 years ago

Ok, I see - but actually it's a bit more complex: the user only specifies the theory card, (which, as you said, would be ignored since we have no other choice), but the "MC runcard" is always in the (runcards) repo and not explicit (so is by chance implicitly the one the NNLOjet people used)

alecandido commented 2 years ago

I guess we have no access to the NNLOjet runcard, since we don't even have access to NNLOjet.

So I believe in the corresponding runcards folder, there will be only metadata, and the actual runcard will be replaced with information needed to retrieve the grid...

scarlehoff commented 2 years ago

I think NNLOjet has only been used for K-factors?

alecandido commented 2 years ago

Yeah, but the idea was to get rid of K-factors, and burn them into PineAPPL grids, if I'm not wrong...

scarlehoff commented 2 years ago

Really? In any case, what information do you need from NNLOJET (or any other program?). The K-factors are a bit of a "god given" number.

I think the idea is (or should be) getting rid of K-factors and using NNLO grids.

alecandido commented 2 years ago

But how do you generate NNLO grids? And in particular, how do you generate NNLO grids in the next few weeks/months?

scarlehoff commented 2 years ago

You don't. That's why I'm not entirely sure why this is relevant. In the next few weeks/months the K-factors are "god-given" numbers that are already in NNPDF and that are applied on top of the NLO grids.

alecandido commented 2 years ago

Indeed, but we won't have a step to apply K-factors (and we explicitly stated we won't provide one). Are you suggesting we should only do NLO fits for the time being?

scarlehoff commented 2 years ago

No, we can do a NNLO fit but whenever we need a K-factor this is a multiplicative factor applied by validphys so it doesn't matter whether the underlying fktable is pineappl or not.
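A minimal sketch of what "a multiplicative factor" means here, assuming one K-factor per bin. The function name and the numbers are purely illustrative and this is not the validphys implementation:

```python
def apply_kfactors(nlo_predictions, kfactors):
    """Multiply each NLO bin prediction by the K-factor of the same bin."""
    if len(nlo_predictions) != len(kfactors):
        raise ValueError("exactly one K-factor per bin is required")
    return [pred * k for pred, k in zip(nlo_predictions, kfactors)]


# Illustrative numbers only, not taken from any real dataset.
nlo = [10.0, 20.0, 40.0]
kfac = [1.5, 1.25, 1.0]
approx_nnlo = apply_kfactors(nlo, kfac)
```

Since the factor acts on the final predictions, the step is the same whatever format the underlying grid or FK table was produced in.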

alecandido commented 2 years ago

Ok, so we simply do not deliver NNLO grids for the time being. I guess if we stick to NLO, mg5 can do everything. (Is there something missing?)

In any case, we won't have runcards even for all NLO we need for a while, so we need to run appl2pine for the time being. But I guess we can keep it in fkutil, and then add here only the runcard, whenever it will be ready.

I'm going to close this, does everyone agree?

cschwan commented 2 years ago

Ok, so we simply do not deliver NNLO grids for the time being.

I don't think we can deliver full NNLO FK tables, but instead we do what NNPDF4.0 already did:

which is approximately NNLO. We should give this a name to not confuse ourselves, how about NLO grids + NNLO evolution, or short (N)NLO FK tables?

In any case, we won't have runcards even for all NLO we need for a while, so we need to run appl2pine for the time being.

We already have that: https://github.com/NNPDF/fktables/blob/main/convert_applgrids.sh.

felixhekhorn commented 2 years ago

I'm going to close this, does everyone agree?

Mmm, I'm not sure - I think, we should still do something here

Ok, so we simply do not deliver NNLO grids for the time being. I guess if we stick to NLO, mg5 can do everything. (Is there something missing?)

True, but this is not the problem here, right? We still need to convert grids (even at NLO, I think, e.g. DIS jets).

In any case, we won't have runcards even for all NLO we need for a while, so we need to run appl2pine for the time being. But I guess we can keep it in fkutil, and then add here only the runcard, whenever it will be ready.

We already have that: https://github.com/NNPDF/fktables/blob/main/convert_applgrids.sh.

I think, I'd like to move that to this repo since, as said here, to me a new theory is "(list of PineAPPL files) + (list of {dataset}.yaml)" and, to me, runcards is responsible to generate the first list ...

Furthermore, stuff like this should really be spelled out locally, and this could be done exactly here ...
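To make the "(list of PineAPPL files) + (list of {dataset}.yaml)" picture concrete, here is a rough sketch. The `Theory` class, its fields and the file names are all hypothetical, since the actual format is still to be discussed in the fktables issue. It only checks that every dataset refers to grids that are actually in the list:

```python
from dataclasses import dataclass, field


@dataclass
class Theory:
    """Hypothetical container: grid file names plus, per dataset, the grids it combines."""

    grids: list
    datasets: dict = field(default_factory=dict)

    def missing_grids(self):
        """Return, for each dataset, the grids it needs that are not in the list."""
        available = set(self.grids)
        missing = {}
        for name, needed in self.datasets.items():
            absent = sorted(set(needed) - available)
            if absent:
                missing[name] = absent
        return missing


# Illustrative names only.
theory = Theory(
    grids=["A.pineappl.lz4"],
    datasets={"DS1": ["A.pineappl.lz4"], "DS2": ["A.pineappl.lz4", "B.pineappl.lz4"]},
)
```

In this picture the runner would be responsible for making `missing_grids()` come back empty, whether a grid is computed from scratch or converted.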

cschwan commented 2 years ago

@felixhekhorn I see, now I understand what you're after. That would certainly be convenient, but it's probably not a priority right now (!?).

cschwan commented 2 years ago

In the easiest case we could write a postrun.sh in the corresponding dataset directory which

./rr would have to make sure we have appl2grid and fastNLO (and their dependencies ...) and must make sure not to run either mg5 or yadism.

alecandido commented 2 years ago

Yes, so maybe we can just implement a void backend (some sort of noop) and rely on postrun.sh.
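A minimal sketch of such a void backend, assuming a runner interface with a `run` and a `postprocess` step. The class and method names are hypothetical and not the actual runner API:

```python
import pathlib
import subprocess


class NoopRunner:
    """Hypothetical backend: runs no Monte Carlo, only the dataset's postrun.sh."""

    def __init__(self, dataset_dir):
        self.dataset_dir = pathlib.Path(dataset_dir)

    def run(self):
        # Intentionally empty: neither mg5 nor yadism is invoked.
        return None

    def postprocess(self):
        script = self.dataset_dir / "postrun.sh"
        if not script.exists():
            raise FileNotFoundError(f"conversion needs {script}")
        # Run the script from inside the dataset directory and fail loudly on errors.
        subprocess.run(["sh", script.name], cwd=self.dataset_dir, check=True)
```

Everything specific to the conversion then lives in postrun.sh, so the backend itself needs no knowledge of the converters.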

Most of the dependencies we already have, I guess we just need meson and to compile them.

cschwan commented 2 years ago

Yes, so maybe we can just implement a void backend (some sort of noop) and rely on postrun.sh.

That sounds good!

Most of the dependencies we already have, I guess we just need meson and to compile them.

You can install meson and ninja using pip, so that should be easy!

alecandido commented 2 years ago

I hope to get back here soon (even if not immediately), and I was also looking back at https://github.com/NNPDF/runcards/issues/124#issuecomment-1033982276.

Do we also want to move appl2pine and fnlo2pine in here? Otherwise I can simply download them alongside pineappl, and install from there.

As I wrote above, the runner can be simply a void one, just running postrun.sh, in which there will be a suitable call to the proper converter (provided by rr) on a suitably named grid, that has to be provided by the user.
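A rough sketch of how the proper converter could be picked for a user-provided grid. The suffix-to-converter mapping and the argument order (input first, then output) are assumptions made for illustration, not the documented interface of appl2pine or fnlo2pine:

```python
import pathlib

# Assumed mapping from file suffix to converter; the suffixes are illustrative.
CONVERTERS = {".appl": "appl2pine", ".root": "appl2pine", ".tab": "fnlo2pine"}


def conversion_command(grid_path, output_path):
    """Build (but do not run) the command that would convert the given grid."""
    grid = pathlib.Path(grid_path)
    try:
        converter = CONVERTERS[grid.suffix]
    except KeyError:
        raise ValueError(f"no converter known for {grid.name}") from None
    # Assumed argument order: input grid first, output grid second.
    return [converter, str(grid), str(output_path)]
```

Returning the command as a list keeps the choice of converter easy to test separately from actually running it.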

cschwan commented 2 years ago

Do we also want to move appl2pine and fnlo2pine in here? Otherwise I can simply download them alongside pineappl, and install from there.

What do you mean exactly with move?

alecandido commented 2 years ago

Get the code in here.

Of course it would be a problem if it gets out of sync with that in the pineappl repository, but the issue is that the examples are not packaged with pineappl (even though they might be considered distributed alongside in the GitHub release).

Maybe it's enough to just use the code from the pineappl repository, since the examples are officially maintained (and arguably part of the distribution).

alecandido commented 2 years ago

Speaking of, maybe we should start using a PineAPPL release, instead of master. What do you think @cschwan?

cschwan commented 2 years ago

Speaking of, maybe we should start using a PineAPPL release, instead of master. What do you think @cschwan?

I agree! Version 0.5.0 should support everything we need, otherwise I'll make a point release.

cschwan commented 2 years ago

Of course it would be a problem if it gets out of sync with that in the pineappl repository, but the issue is that the examples are not packaged with pineappl (even though they might be considered distributed alongside in the GitHub release).

I'd like to promote them from examples to proper programs, but appl2pine requires ROOT and the NNPDF-modified APPLgrid, where the former is big and complicated to install. fnlo2pine requires the official fastNLO. So the problem is to package all these dependencies.

I was thinking about integrating the fastNLO converter into the PineAPPL CLI as pineappl import, but I don't know how difficult it is.

felixhekhorn commented 2 years ago

I'd like to promote them from examples to proper programs, but appl2pine requires ROOT and the NNPDF-modified APPLgrid, where the former is big and complicated to install. fnlo2pine requires the official fastNLO. So the problem is to package all these dependencies.

I was thinking about integrating the fastNLO converter into the PineAPPL CLI as pineappl import, but I don't know how difficult it is.

Maybe not inside the pineappl library? (such that you can opt-out) You could do a pineappl_utils (or similar name) alongside pineappl_cli etc. Of course we would prefer also a python binding ...

On the other hand, appl2pine seems sufficiently complicated, so I wonder whether we should leave it where it is (of course the complication will become relevant here, as e.g. we would need to download its dependencies ...)

alecandido commented 2 years ago

I agree! Version 0.5.0 should support everything we need, otherwise I'll make a point release.

This I'm going to do in a separate PR (and this way I'll even drop the dependency on pygit2, and a considerable amount of Git overhead).

alecandido commented 2 years ago

I'd like to promote them from examples to proper programs, but appl2pine requires ROOT and the NNPDF-modified APPLgrid, where the former is big and complicated to install. fnlo2pine requires the official fastNLO. So the problem is to package all these dependencies. I was thinking about integrating the fastNLO converter into the PineAPPL CLI as pineappl import, but I don't know how difficult it is.

Maybe not inside the pineappl library? (such that you can opt-out) You could do a pineappl_utils (or similar name) alongside pineappl_cli etc. Of course we would prefer also a python binding ...

On the other hand, appl2pine seems sufficiently complicated, so I wonder whether we should leave it where it is (of course the complication will become relevant here, as e.g. we would need to download its dependencies ...)

I'd say that the CLI would be optimal, I'm thinking about how to do it in practice. I'm going to move it to a dedicated PineAPPL discussion, most likely this is not the best place.