NNPDF / theories

Contains all ingredients (grid, operator card, dataset definition) necessary to regenerate any theory using the pineko structure.

Jet Pineappl grids #5

Closed t7phy closed 1 year ago

t7phy commented 1 year ago

Hi @scarlehoff

I have a few jet theory grids that should be added to the theories repo. However, there are two new aspects to them:

  1. they are at NNLO
  2. they are obtained by conversion from fastnlo format.

Which folder should these be added to?

cc @enocera

scarlehoff commented 1 year ago

What are these theory grids? Do they correspond to existing datasets? Are they completely new?

I'm putting the grids at NNLO that substitute previous NLO grids in theory 600.

enocera commented 1 year ago

These grids are for (part of) the old ttbar data and for jet data (old and new). I'd suggest putting them in theory 600, then.

t7phy commented 1 year ago

@scarlehoff another thing: the jet theories are double differential, so they are provided as multiple files, where each file corresponds to one bin of one of the differential variables. How should they be named? Should it be <dir name>_<observable name>-BIN1.pineappl.lz4 or <dir name>_<observable name>_BIN1.pineappl.lz4? (Here dir name and observable name are the same as those in buildmaster.)
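To make the two candidate patterns concrete, here is a minimal shell sketch. The directory name, observable name, and number of bins are made up for illustration and do not refer to real buildmaster entries.

```shell
# Hypothetical directory and observable names, for illustration only.
dir_name="EXAMPLE_JET_8TEV"
obs_name="PTY"

# One grid file per bin of one of the differential variables.
for bin in 1 2 3; do
    hyphen_name="${dir_name}_${obs_name}-BIN${bin}.pineappl.lz4"
    underscore_name="${dir_name}_${obs_name}_BIN${bin}.pineappl.lz4"
    echo "$hyphen_name  or  $underscore_name"
done
```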

scarlehoff commented 1 year ago

I would keep the names they already have in this repository since the individual names of the grids do not matter and there are already many theories that depend on those names.

t7phy commented 1 year ago

I don't mean to rename the ones that are already there. For the new ones that I add, I will follow the names we agreed on for the new buildmaster. The question is whether there should be a - or a _ before BINX. Is that what you mean, that it doesn't matter?

scarlehoff commented 1 year ago

The names of the grids are only used in the theory part of the new data implementation, and they usually consist of the name of the dataset + BINX

https://github.com/NNPDF/theories/blob/main/data/yamldb/400/ATLAS_DY_2D_8TEV_LOWMASS.yaml

but sometimes they even contain arXiv references https://github.com/NNPDF/theories/blob/main/data/yamldb/400/ATLAS_WM_JET_8TEV_PT.yaml

tbh, it doesn't matter (to me), but if I have to choose, just follow whatever is closest to the names that are already there.

felixhekhorn commented 1 year ago

> but sometimes they even contain arXiv references

Please avoid arXiv numbers or, to be more specific, please avoid a dot (.) in the names. This way we can split more easily by suffix.
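As an illustration of why a dot in the name gets in the way, here is a small shell sketch that splits a file name at its first dot. Both grid names are hypothetical, and splitting at the first dot is an assumption made for this example, not necessarily how the actual tooling does it.

```shell
# Hypothetical grid names, made up for illustration only.
with_dot="EXAMPLE_JET_1234.56789_BIN1.pineappl.lz4"
without_dot="EXAMPLE_JET_BIN1.pineappl.lz4"

# Strip everything from the first dot onwards to get the grid name.
stem_with_dot="${with_dot%%.*}"
stem_without_dot="${without_dot%%.*}"

echo "$stem_with_dot"      # the arXiv-style number cuts the name short
echo "$stem_without_dot"   # the full grid name survives
```

With no dot in the grid name, everything before the first dot is the name and everything after it is the suffix.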

scarlehoff commented 1 year ago

@t7phy what is the status of this?

It would be good to have these in light of #11, since if the differences with respect to the ones we have are small enough, it might be nice to use them also for MHOU.

Are they converted from ploughshare?

t7phy commented 1 year ago

> @t7phy what is the status of this?
>
> It would be good to have these in light of #11, since if the differences with respect to the ones we have are small enough, it might be nice to use them also for MHOU.
>
> Are they converted from ploughshare?

They were all converted. However, I will have to reconvert them, as previously the metadata was not copied from the fastnlo grids. That feature was recently added to pineappl, and since then I have been very caught up with some work. I can have them all ready next week. Shall I just push to master in the 600 directory, or will I need to open a PR?

scarlehoff commented 1 year ago

Open a PR please, so that we can also do some tests and keep track of the discussion.

t7phy commented 1 year ago

@scarlehoff Some grid files are above the 100 MB threshold. How do I go about adding them? git lfs?

scarlehoff commented 1 year ago

Yes... I was hoping this wouldn't happen so soon (#1 is becoming a priority now...)

(just to make sure, you mean the grid and not the fktable? If you optimize the grid with pineappl, does it go below 100 MB?)

t7phy commented 1 year ago

Yes, the grid. I am not sure about optimizing; I just converted from fnlo to pineappl. However, some of the files are ~200-250 MB, so I doubt they could go below 100.
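For reference, a quick way to see which grids exceed the threshold is a `find` filter on size. This is a minimal sketch, assuming GNU coreutils (`truncate`) and a `find` that accepts the `M` size suffix. The function name and the demo file names are made up, and the demo files are empty sparse placeholders, not real grids.

```shell
# List grids larger than 100 MiB under a directory (default: current directory).
list_large_grids() {
    find "${1:-.}" -type f -name '*.pineappl.lz4' -size +100M
}

# Demo on throwaway sparse files with made-up names.
demo_dir="$(mktemp -d)"
truncate -s 150M "$demo_dir/EXAMPLE_BIN1.pineappl.lz4"
truncate -s 10M "$demo_dir/EXAMPLE_BIN2.pineappl.lz4"

large_grids="$(list_large_grids "$demo_dir")"
echo "$large_grids"
```

Running the same function on the output directory after optimizing would show whether any file still needs special handling.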

scarlehoff commented 1 year ago

With the pineappl cli you can do:

pineappl optimize grid.pineappl.lz4 optimized_grid.pineappl.lz4

Optimising might be good even if it doesn't change much, of course, but I thought the files were only a little bit over the threshold. If we are already beyond 200 MB we definitely need a solution. Use git-lfs for now and I'll try to prioritise having a trusted place to put grids.
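A minimal sketch of the git-lfs setup, run in a throwaway repository so nothing real is touched. The tracked pattern is an assumption (all files ending in .pineappl.lz4), and the script skips the LFS steps if git-lfs is not installed.

```shell
# Pattern for the grid files; an assumption, adjust to the files actually added.
pattern='*.pineappl.lz4'

# Work in a throwaway repository so the sketch does not touch a real checkout.
repo="$(mktemp -d)"
git -C "$repo" init -q

if git lfs version >/dev/null 2>&1; then
    # Install the LFS hooks for this repository only.
    (cd "$repo" && git lfs install --local)
    # Record the pattern in .gitattributes so matching files are stored via LFS.
    (cd "$repo" && git lfs track "$pattern")
    git -C "$repo" add .gitattributes
    lfs_status="tracked"
else
    lfs_status="git-lfs not installed"
    echo "$lfs_status"
fi
```

In a real checkout the same `git lfs track` call would be followed by the usual `git add` and `git commit` of the grids on the PR branch.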