Added CDFZRAP_NEW data set

enocera commented 3 years ago

This PR extends the database to allow the generation of a fixed version of the CDF Z rapidity distributions.

cschwan commented 3 years ago

I think the *.root file names are reversed. I see that

CDFZRAP -> CDF_ZRAP_MCgrid_500M_last_two_bins_combined.root
CDFZRAP_NEW -> CDF_ZRAP_MCgrid_500M.root

which doesn't match the bin numbers in the database or the number of K factors.

enocera commented 3 years ago

@cschwan Thanks for checking. This is strange. Let me try to proceed one step at a time.

1) The fixed implementation of the CDF Z rapidity measurement should have 28 points instead of 29. Do you agree?
2) It seems to me that the data/theory comparison produced here https://vp.nnpdf.science/cj0tf26KSq6KLJuBj7IAsA== after fixing the data set indeed has 28 points instead of 29. The data central values seem to match those in v4. Do you agree?
3) The DATA_ file contains 28 points, as does the K factor file. Do you agree?
4) Now, the grid applgrids/CDFZRAP_NEW/CDF_ZRAP_MCgrid_500M_last_two_bins_combined.root (that is used to produce the FK table) has 29 bins, the last bin being an overflow bin returning -nan. Do you agree? In the apfelcomb database, I removed the last bin, so that the FK table has only 28 bins. The grid applgrids/CDFZRAP_NEW/CDF_ZRAP_MCgrid_500M.root has 29 bins as well, all of which return a finite number. Note that this grid is duplicated from the CDFZRAP folder into the CDFZRAP_NEW folder just to allow the user to produce the grid with combined bins, but it is not used at any time to produce the CDFZRAP_NEW FK table.

Unless I'm missing something, I don't think that there's a mismatch (and if there were one, I wouldn't be able to successfully produce the validphys data/theory comparison in the first place). Maybe I didn't understand what you meant?

cschwan commented 3 years ago

@enocera :

1) Yes. But am I right to assume that you call the fixed measurement (the data taken from the v4 revision of the arXiv paper) CDFZRAP_NEW?
2) Yes, it does, but I wonder whether the theory prediction of the last bin is correct. The database for process 501, which should be the newest dataset, points to the old .root file (where the last bin is cut away).
3) Yes.
4) The last bin of the updated .root contains a -NaN, which must be and is cut away. But as I said in 2), I think that the given root file is wrong. To put it in other words, I don't understand why the line for dataset 500 changes at all. Did you rename the APPLgrid files?

What I'm worried about is that there is a mismatch, which can only show up in the last bin, since the theory predictions are otherwise the same.
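A quick way to see where two sets of theory predictions disagree is to compare them bin by bin. The following is only a minimal sketch: the helper `differing_bins` and the numbers are made up for illustration and are not taken from the actual grids.

```python
import math


def differing_bins(pred_a, pred_b, rel_tol=1e-6):
    """Return the indices of bins where two prediction lists disagree.

    Hypothetical helper, not part of apfelcomb. A NaN entry is always
    reported as differing, because math.isclose returns False for NaN.
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists have different lengths")
    return [
        i
        for i, (a, b) in enumerate(zip(pred_a, pred_b))
        if not math.isclose(a, b, rel_tol=rel_tol)
    ]


# Made-up predictions that agree everywhere except in the last bin.
old_grid = [1.0, 2.0, 3.0, 0.5]
new_grid = [1.0, 2.0, 3.0, 0.8]
print(differing_bins(old_grid, new_grid))
```

If only the last index is reported, the two grids are interchangeable everywhere else, which is why a swap would go unnoticed in the other bins.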

enocera commented 3 years ago

@cschwan OMG! You're right! THERE IS a mismatch in the apfelcomb database: the .root file in 500 should be in 501 and the other way around! Thanks so much! Let me fix this!

enocera commented 3 years ago

Everything works by accident because apfelcomb checks only the number of bins, but not the bin kinematics!
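A check on the bin kinematics, in addition to the bin count, could look roughly like the sketch below. It is not apfelcomb code: the function `check_bin_kinematics` and the bin edges are assumptions made for illustration, and the edges are not the real CDF binning.

```python
import math


def check_bin_kinematics(grid_bins, data_bins, rel_tol=1e-9):
    """Compare (low, high) bin edges of a grid against those of the data.

    Hypothetical helper. Returns a list of (index, grid_bin, data_bin)
    for every bin whose edges do not match within the tolerance.
    """
    if len(grid_bins) != len(data_bins):
        raise ValueError("grid and data have a different number of bins")
    mismatches = []
    for i, (g, d) in enumerate(zip(grid_bins, data_bins)):
        same_low = math.isclose(g[0], d[0], rel_tol=rel_tol, abs_tol=1e-12)
        same_high = math.isclose(g[1], d[1], rel_tol=rel_tol, abs_tol=1e-12)
        if not (same_low and same_high):
            mismatches.append((i, g, d))
    return mismatches


# Illustrative edges: same number of bins, but the last data bin is wider,
# as it would be if the last two bins had been combined.
grid_bins = [(0.0, 0.1), (0.1, 0.2), (0.2, 0.3)]
data_bins = [(0.0, 0.1), (0.1, 0.2), (0.2, 0.4)]
print(check_bin_kinematics(grid_bins, data_bins))
```

A count-only check passes for this example, whereas the edge comparison flags the last bin.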

enocera commented 3 years ago

Well, 501 works, but not 500.

cschwan commented 3 years ago

Great! I was very surprised by the fit of the last bin; before the fix the data was too large, and now it is too small. I hope that with the right theory prediction, fit and data are much closer.

NNPDF / apfelcomb

Added CDFZRAP_NEW data set #79