NNPDF / pinecards

Runcards needed to generate PineAPPL grids for NNPDF processes

Add PDF4LHC21 cards #121

Closed cschwan closed 1 year ago

cschwan commented 2 years ago

@Radonirinaunimi here's a first shot at the distribution @juanrojochacon wants for PDF4LHC21. However, the integration dies after step 0 because not enough points pass through the cuts:

ERROR: NOT ENOUGH POINTS PASS THE CUTS. RESULTS CANNOT BE TRUSTED. LOOSEN THE GENERATION CUTS, OR ADAPT SET_TAU_MIN() IN SETCUTS.F ACCORDINGLY.

What I don't understand is that I've already increased the value of tau_min to 1 TeV. @marcozaro do you have any idea what I can do?

Radonirinaunimi commented 2 years ago

> @Radonirinaunimi here's a first shot at the distribution @juanrojochacon wants for PDF4LHC21. However, the integration dies after step 0 because not enough points pass through the cuts:
>
> ERROR: NOT ENOUGH POINTS PASS THE CUTS. RESULTS CANNOT BE TRUSTED. LOOSEN THE GENERATION CUTS, OR ADAPT SET_TAU_MIN() IN SETCUTS.F ACCORDINGLY.
>
> What I don't understand is that I've already increased the value of tau_min to 1 TeV. @marcozaro do you have any idea what I can do?

Thanks @cschwan for having had an attempt in generating this grid. I haven't started generating the actual grid yet as I am still trying to sort out some hiccups with poetry (related to pip caching) when installing on the nikhef cluster. I will open an issue tomorrow if I don't manage to find a way around this now. What I can do in the meantime is to run the card that you just pushed on my local machine in which everything works fine.

cschwan commented 2 years ago

@marcozaro here's the full log, with print *, mll, obs statements to check when points are fed to analysis.f: log.txt

juanrojochacon commented 2 years ago

This is very odd @cschwan @Radonirinaunimi - is this a grid problem or does it also happen with standalone mg5? If you reduce tau_min all the way down to the kinematic limit, do you still get this problem? I think that provided we generate events somewhat above the Z pole we should be fine with statistics ...

cschwan commented 2 years ago

@juanrojochacon this is a problem in step 0, where no grid has been generated yet (that only happens in step 1), so a general problem. What I don't understand is why so few points end up in analysis.f, as you can see in the logfile I sent to Marco. I have a few ideas I'll try tomorrow.

Radonirinaunimi commented 2 years ago

@cschwan If I use the run card PDF4LHC_DY_13_TEV_21_PHENO, what I get is the following:

cschwan commented 2 years ago

@Radonirinaunimi yes, thanks for reproducing the error!

Radonirinaunimi commented 2 years ago

> @Radonirinaunimi yes, thanks for reproducing the error!

Is there something I should check/investigate in the meantime?

cschwan commented 2 years ago

@Radonirinaunimi not at this point, thanks!

cschwan commented 2 years ago

I don't understand exactly why Madgraph5 crashed, but since I've imposed a minimum invariant mass cut on the lepton pair, it seems to correctly set the minimum invariant of the partonic process. I've updated the parameters, so this should be ready for a production run. @Radonirinaunimi would you please take care of that? Run it with theory 200, in the following way:

./rr run PDF4LHC_DY_13_TEV_21_PHENO theories/theory_200_1.yaml

Radonirinaunimi commented 2 years ago

> I don't understand exactly why Madgraph5 crashed, but since I've imposed a minimum invariant mass cut on the lepton pair, it seems to correctly set the minimum invariant of the partonic process. I've updated the parameters, so this should be ready for a production run. @Radonirinaunimi would you please take care of that? Run it with theory 200, in the following way:
>
> ./rr run PDF4LHC_DY_13_TEV_21_PHENO theories/theory_200_1.yaml

Thanks! I will try this now.

juanrojochacon commented 2 years ago

Good, very nice. Indeed, the mll generation cut is the one I proposed originally, so I thought we were already imposing it

juanrojochacon commented 2 years ago

Anyway, provided it works we should be fine. We can use mll > 200 GeV or something. Btw @Radonirinaunimi please also generate grids for the CC process (everything else is identical, so adapting the runcard should be trivial).

Radonirinaunimi commented 2 years ago

> Good, very nice. Indeed, the mll generation cut is the one I proposed originally, so I thought we were already imposing it

@juanrojochacon I think @cschwan is stating the opposite. As far as I understand Christopher's last changes (which indeed work on my local computer; I will send them to the cluster later), it works upon removing the minimum cut (there might be a typo in his message above).

juanrojochacon commented 2 years ago

@cschwan says " since I've imposed a minimum invariant mass cut on the lepton pair, it seems to correctly set the minimum invariant of the partonic process". Before, we were imposing a different cut, I think.

cschwan commented 2 years ago

The problem was that imposing both a cut on the partonic centre-of-mass energy (Madgraph5 calls it tau_min) and the lepton invariant mass mll made Madgraph5 crash. At LO these two variables are the same, and Madgraph5 correctly calculates tau_min from the mll cut.
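To make the LO relation concrete, here is a minimal sketch. It assumes the usual dimensionless definition tau = s_hat / s (squared partonic over squared hadronic centre-of-mass energy) and that at LO s_hat equals mll^2. The function name is hypothetical, and Madgraph5 may store tau_min in different units internally (this thread quotes it in TeV), so this is an illustration rather than Madgraph5's actual code.

```python
def tau_min_from_mll(mll_min_gev: float, sqrt_s_gev: float) -> float:
    """Lower bound on tau = s_hat / s implied by a cut mll > mll_min at LO.

    Assumption of this sketch: at LO the partonic centre-of-mass energy
    equals the lepton-pair invariant mass, so s_hat_min = mll_min**2.
    """
    if not 0.0 < mll_min_gev < sqrt_s_gev:
        raise ValueError("need 0 < mll_min < sqrt(s)")
    return (mll_min_gev / sqrt_s_gev) ** 2


# 13 TeV collider energy with the 1 TeV invariant-mass cut discussed here
print(tau_min_from_mll(1000.0, 13000.0))
```

Under these assumptions a separate tau_min setting carries no extra information once the mll cut is in place, which fits the observation that imposing only the mll cut is sufficient.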

cschwan commented 2 years ago

@juanrojochacon the fact that the invariant mass of the lepton-neutrino system isn't a collider observable isn't a problem for this exercise, right?

juanrojochacon commented 2 years ago

Not at all, this is just for a "theory" exercise, we don't want to reproduce any exp measurement. And this way it is easier to compare the NC and CC calculations, and to connect the results with the possible issue of negative PDFs at large-x

cschwan commented 2 years ago

Alright, thanks for confirming that.

cschwan commented 2 years ago

@marcozaro could you please have a look at the charged current process and tell us what's wrong with it? We have a large invariant mass cut on the lepton-neutrino system, but Madgraph5 complains about too few points coming through the cuts. We've adjusted tau_min, but that doesn't seem to help.

marcozaro commented 2 years ago

Hi, running ./run.sh PDF4LHC_WP_13_TEV_21_PHENO I get this error:

patching file SubProcesses/setscales.f
Hunk #1 succeeded at 540 (offset 13 lines).
Hunk #2 succeeded at 592 (offset 13 lines).
/home/marcozaro/MadGraph/runcards/PDF4LHC_WP_13_TEV_21_PHENO-20220111170402
Error: unrecognised cut: mmlnumin

Any idea? Cheers,

Marco

cschwan commented 2 years ago

@marcozaro ./run.sh is outdated and we don't use it anymore. Please use ./rr instead, see also the README. You should run

  1. ./rr install first to install all Python dependencies and
  2. ./rr run PDF4LHC_WP_13_TEV_21_PHENO theories/theory_200_1.yaml to generate the grid.

Radonirinaunimi commented 2 years ago

In the meantime (@juanrojochacon) below are some results for the NC. I did not include the CT18_PDF4LHC21 set as I think the updated grid is not on the server yet.

In this PR (https://github.com/NNPDF/pineapplgrids/pull/26) you can also find pheno results using the published PDF sets (NNPDF4.0, CT18, MSHT20).

Pheno plots:

[Plots: PDF4LHC_DY_13_TEV_21_PHENO-integrated, PDF4LHC_DY_13_TEV_21_PHENO-internal, PDF4LHC_DY_13_TEV_21_PHENO-global]

Detailed stats:

Results with PDF4LHC21_nnlo_mc:

bin    Mll      dsig/dMll   neg unc pos unc
---+----+----+-------------+-------+-------
  0 1000 1500  8.8348752e-6  -1.73%   1.76%
  1 1500 2000  1.2160266e-6  -2.19%   2.00%
  2 2000 3000  1.4455877e-7  -2.65%   2.26%
  3 3000 4000  8.8164697e-9  -3.37%   2.67%
  4 4000 5000 6.3709040e-10  -4.31%   3.27%
  5 5000 7000 1.3187630e-11  -9.02%   7.62%

Results with NNPDF31_nnlo_as_0118_PDF4LHC21:

bin    Mll      dsig/dMll    neg unc pos unc
---+----+----+--------------+-------+-------
  0 1000 1500   8.7770143e-6  -1.73%   1.75%
  1 1500 2000   1.1991972e-6  -2.21%   2.00%
  2 2000 3000   1.4013975e-7  -2.68%   2.29%
  3 3000 4000   7.9208905e-9  -3.52%   2.82%
  4 4000 5000  4.0874324e-10  -5.66%   4.36%
  5 5000 7000 -2.9488025e-11   2.34%  -2.26%

Results with MSHT20nnlo_as118_PDF4LHC21:

bin    Mll      dsig/dMll   neg unc pos unc
---+----+----+-------------+-------+-------
  0 1000 1500  9.0105540e-6  -1.71%   1.74%
  1 1500 2000  1.2580997e-6  -2.16%   1.98%
  2 2000 3000  1.5257612e-7  -2.62%   2.23%
  3 3000 4000  9.8157808e-9  -3.26%   2.56%
  4 4000 5000 8.1521783e-10  -3.87%   2.90%
  5 5000 7000 3.8758066e-11  -4.73%   3.62%

juanrojochacon commented 2 years ago

Thanks, this looks good. Could you please produce a histogram of the replicas of PDF4LHC21 in the right-most bin? To understand whether or not we have negative replicas there. Thanks!

I will add the NC results to the paper, and eventually update them with the CC ones.

juanrojochacon commented 2 years ago

Yet I realise that I don't understand the numbers. In the table below, is the error the scale uncertainty, the PDF uncertainty, or both? The PDF error seems gigantic in the right-most bin, right? Maybe I am missing something, @Radonirinaunimi?

Results with PDF4LHC21_nnlo_mc:

bin    Mll      dsig/dMll   neg unc pos unc
---+----+----+-------------+-------+-------
  0 1000 1500  8.8348752e-6  -1.73%   1.76%
  1 1500 2000  1.2160266e-6  -2.19%   2.00%
  2 2000 3000  1.4455877e-7  -2.65%   2.26%
  3 3000 4000  8.8164697e-9  -3.37%   2.67%
  4 4000 5000 6.3709040e-10  -4.31%   3.27%
  5 5000 7000 1.3187630e-11  -9.02%   7.62%

cschwan commented 2 years ago

@juanrojochacon by default pineappl convolute prints the scale uncertainties, which seem OK. Note that the MC uncertainties for the last bin are a bit large (see https://github.com/NNPDF/pineapplgrids/pull/26#issue-818408470, in the first table in the sigma column given in per cent), but this is unavoidable if we generate them together with the other bins; we could start a dedicated run for this bin, however.

The PDF uncertainties are shown in the plots, but you're probably interested in the numerical values. @Radonirinaunimi simply run pineappl pdf_uncertainty PDF4LHC_WP_13_TEV_21_PHENO.pineappl.lz4 PDFSET for each PDFSET to print them.

juanrojochacon commented 2 years ago

yes, I want the numbers for the PDF errors, and ideally also the histogram to show to which extent the distribution is gaussian. But the list of cross-sections for the 900 replicas in the last bin is also sufficient, whatever is simpler.

How is the central value computed? Is it the central replica? If so, I fear this only works if the distribution is Gaussian.

cschwan commented 2 years ago

Every number given above should use the central replica, but pineappl pdf_uncertainty (for which @Radonirinaunimi will show the output shortly I suppose) uses LHAPDF's PDFSet::uncertainty, which doesn't use the central replica but instead performs the average over all other replicas and also calculates the uncertainty from them.
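The difference between the two conventions can be sketched as follows. This is only an illustration of "average over the replicas" versus "central member", not LHAPDF's implementation: the function name is hypothetical, the numbers are made up, and whether LHAPDF divides by N or N - 1 in the standard deviation is an assumption to check against its documentation.

```python
from statistics import mean, stdev


def replica_mean_and_std(predictions):
    """Average and spread over Monte Carlo replicas.

    `predictions` holds one cross-section per replica (members 1..N);
    the central member 0 is deliberately not part of the input.
    """
    if len(predictions) < 2:
        raise ValueError("need at least two replicas")
    # stdev() divides by N - 1; treating this as LHAPDF's choice is an assumption
    return mean(predictions), stdev(predictions)


# Made-up numbers for illustration only, not results from this thread
toy_replicas = [1.0, 2.0, 3.0]
central, spread = replica_mean_and_std(toy_replicas)
print(central, spread)
```

For a Monte Carlo set the replica average and the prediction from member 0 need not coincide exactly, which is why the two commands can print slightly different central values.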

@Radonirinaunimi to evaluate each replica for the last bin, do the following:

  1. Get the LHAPDF ID as a number, for instance NNPDF40_nnlo_as_01180 is 331100
  2. Run convolute for each of the replicas, for the last bin (-b 5):
for i in {1..100}; do
    pineappl --silence-lhapdf convolute PDF4LHC_DY_13_TEV_21_PHENO.pineappl.lz4 $(( $i + 331100 )) -b 5
done

Radonirinaunimi commented 2 years ago

> Every number given above should use the central replica, but pineappl pdf_uncertainty (for which @Radonirinaunimi will show the output shortly I suppose) uses LHAPDF's PDFSet::uncertainty, which doesn't use the central replica but instead performs the average over all other replicas and also calculates the uncertainty from them.

I am generating the PDF uncertainties now. For some reason, it always takes very long for the PDF4LHC21 grids to load all the members.

> @Radonirinaunimi to evaluate each replica for the last bin, do the following:
>
> 1. Get the LHAPDF ID as a number, for instance `NNPDF40_nnlo_as_01180` is `331100`
> 2. Run convolute for each of the replicas, for the last bin (`-b 5`):
>
>     for i in {1..100}; do
>         pineappl --silence-lhapdf convolute PDF4LHC_DY_13_TEV_21_PHENO.pineappl.lz4 $(( $i + 331100 )) -b 5
>     done

What if the set does not have a LHAPDF ID?

cschwan commented 2 years ago

> What if the set does not have a LHAPDF ID?

In that case I think you can edit ${prefix}/share/LHAPDF/pdfsets.index and add an index yourself.
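A minimal sketch of such an edit, assuming each line of pdfsets.index holds the ID of a set's first member, the set name and a data version separated by whitespace. The helper name, the ID 9000000 and the set name MY_LOCAL_SET are hypothetical; the function works on the file contents as a string and leaves reading and writing the file to the caller.

```python
def add_index_entry(index_text: str, lhapdf_id: int, set_name: str, version: int = 1) -> str:
    """Return the index contents with a new '<id> <name> <version>' line appended.

    The assumed line layout is an assumption of this sketch; compare it with
    an existing pdfsets.index before relying on it.
    """
    for line in index_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        if fields[0] == str(lhapdf_id):
            raise ValueError(f"ID {lhapdf_id} is already used by {fields[1]}")
        if fields[1] == set_name:
            raise ValueError(f"{set_name} already has ID {fields[0]}")
    if index_text and not index_text.endswith("\n"):
        index_text += "\n"
    return index_text + f"{lhapdf_id} {set_name} {version}\n"


existing = "331100 NNPDF40_nnlo_as_01180 1\n"
print(add_index_entry(existing, 9000000, "MY_LOCAL_SET"), end="")
```

The chosen ID should also leave room for all members of the set, since the replica loop above addresses member i as the base ID plus i; this sketch does not check that.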

marcozaro commented 2 years ago

Ok, but I have the following issue (even after upgrading pip):

WARNING: You are using pip version 20.3.3; however, version 21.3.1 is available.
You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.
Traceback (most recent call last):
  File "./rr", line 113, in <module>
    subprocess.run("poetry install".split())
  File "/usr/lib/python3.7/subprocess.py", line 472, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.7/subprocess.py", line 775, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.7/subprocess.py", line 1522, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'poetry': 'poetry'

Cheers, M
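The traceback says that the poetry executable itself is not on PATH, which upgrading pip does not change. A small guard along the following lines would turn the traceback into a readable message; it is a sketch of the idea, not the actual code of ./rr, and run_tool is a hypothetical helper.

```python
import shutil
import subprocess


def run_tool(argv, which=shutil.which):
    """Run argv, but fail with a readable message if the executable is missing.

    `which` is injectable only so that the lookup can be replaced in tests.
    """
    if which(argv[0]) is None:
        raise SystemExit(f"'{argv[0]}' was not found on PATH; install it first")
    return subprocess.run(argv, check=True)


# Usage sketch: run_tool("poetry install".split())
```

With such a check in place, the fix on the user's side is to install poetry and make sure its location is on PATH before running ./rr install again.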


Radonirinaunimi commented 2 years ago

@juanrojochacon @cschwan Here are the numbers for the PDF uncertainties & the results of each replica for the last bin:

PDF Uncertainties:

(running pineappl pdf_uncertainty PDF4LHC_DY_13_TEV_21_PHENO.pineappl.lz4 PDFSET):

Prediction per replicas for last bin:

juanrojochacon commented 2 years ago

Thanks! Indeed, a few replicas become negative. Let me play a bit with the results and I will let you know if I have any questions.

juanrojochacon commented 2 years ago

Thanks again @cschwan @Radonirinaunimi for producing these numbers. I have processed them and everything looks as expected: around 15% of the replicas are negative, but the median and 68%CL ranges are unaffected. So this is reassuring.
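A check of this kind can be sketched as below. It assumes one prediction per replica in a plain list and takes the 16th and 84th percentiles as the 68% CL range; the function names and the toy numbers are hypothetical and are not the values produced in this thread.

```python
from statistics import median


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an ascending list, q in [0, 100]."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def summarize_replicas(values):
    """Fraction of negative replicas, median and 16%-84% interval."""
    if not values:
        raise ValueError("no replica predictions given")
    vals = sorted(values)
    return {
        "negative_fraction": sum(v < 0 for v in vals) / len(vals),
        "median": median(vals),
        "cl68": (percentile(vals, 16.0), percentile(vals, 84.0)),
    }


# Made-up predictions for illustration only
print(summarize_replicas([-1.0, 0.5, 1.0, 1.5, 2.0]))
```

Unlike the mean and standard deviation, the median and percentile interval do not presuppose a Gaussian distribution, so they stay meaningful when a tail of replicas goes negative.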

If the same holds for the CC process, then we are done.

Note that if imposing cuts is too difficult, just generate events for pp => W; this should be sufficient and should work out of the box, right?

cschwan commented 2 years ago

@juanrojochacon if we generate CC without cuts the invariant mass will be close to the W-boson mass where the PDFs are typically fine, so I don't think that'll give us much.

juanrojochacon commented 2 years ago

I see. Can you try generating W => u dbar? This way there is no neutrino, and this is really identical to NC then ...

marcozaro commented 2 years ago

Hi, I am investigating the issue of the invariant mass in CC. It looks like the Born momenta are always generated with a W boson close to its mass shell, despite the changes in setcuts. I will discuss this with Rikkert, who knows a lot about the phase space, and let you know.


marcozaro commented 2 years ago

> @juanrojochacon if we generate CC without cuts the invariant mass will be close to the W-boson mass where the PDFs are typically fine, so I don't think that'll give us much.

Given that you have rather large bins, you may give it a try with setting a very high accuracy target…

marcozaro commented 2 years ago

> I see. Can you try generating W => u dbar? This way there is no neutrino, and this is really identical to NC then ...

No, don't do that, as you will also have t-channels and a lot of other stuff around (essentially, dijet at order alpha^2).

marcozaro commented 2 years ago

Rikkert suggested the following:

> The easiest is to remove the cut from cuts.f. Then use the bias function (in cuts.f) to enhance the large-mass region and apply the cut in the analysis file.
>
> For fixed-order runs this has only been working since 3.3.1 (I think). For event generation this has been around longer already.

We need to check that the bias is properly accounted for with PineAPPL...

Cheers,

Marco

cschwan commented 2 years ago

@marcozaro the bias_weight_function sounds very useful, I'll look into that.

cschwan commented 2 years ago

@marcozaro I'm currently looking into this with @Radonirinaunimi and we're wondering whether the bias (we've set it to M_lv^3, the invariant mass of the charged lepton and the neutrino cubed) is correctly removed from the PineAPPL grid. Could you check that?
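What "correctly removed" means can be shown with a toy Monte Carlo. The sketch below draws points with a density proportional to M^3 and compares weights with and without the bias divided out; it is a generic importance-sampling illustration under these assumptions, not Madgraph5's or PineAPPL's implementation, and all names in it are hypothetical.

```python
import math
import random


def estimate_integral(f, a, b, n_points, remove_bias, seed=1):
    """Monte Carlo estimate of the integral of f over [a, b].

    Points are drawn with density proportional to the bias M**3, which
    concentrates them at large M.
    """
    rng = random.Random(seed)
    norm = (b**4 - a**4) / 4.0  # integral of M**3 over [a, b]
    total = 0.0
    for _ in range(n_points):
        # inverse-CDF sampling: M is distributed like M**3 / norm
        m = (a**4 + rng.random() * (b**4 - a**4)) ** 0.25
        weight = f(m) * norm
        if remove_bias:
            weight /= m**3  # undo the bias so the weight is f(M) / density(M)
        total += weight
    return total / n_points


def falling_spectrum(m):
    return m**-4  # toy stand-in for a steeply falling distribution


unbiased = estimate_integral(falling_spectrum, 1.0, 2.0, 100_000, remove_bias=True)
still_biased = estimate_integral(falling_spectrum, 1.0, 2.0, 100_000, remove_bias=False)
print(unbiased, "should be close to", 7.0 / 24.0)
print(still_biased, "should be close to", math.log(2.0))
```

If the weights written into a grid still contained the bias factor, the convolution would reproduce the second number (the integral of f times M^3) rather than the first, so comparing the grid result against an unbiased reference run is one way to test this.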

Radonirinaunimi commented 2 years ago

@cschwan So, there is something wrong with 921dd2e. The run never finished, so I checked again, and the whole folder is taking up a whopping 265 GB of disk space. The only thing I changed was the accuracy (set req_acc_FO 0.0001). I should mention that this happened at step 3.

cschwan commented 2 years ago

Did I forget to comment out some write/print statements? Have a look at the size of the files, please.

Radonirinaunimi commented 2 years ago

> Did I forget to comment out some write/print statements? Have a look at size of the files, please.

There was one write statement that was copied from cuts.f to the patch. I commented it out and ran the process again, but I am still waiting for it to finish.

juanrojochacon commented 2 years ago

Hi @Radonirinaunimi I would like to get the same numbers as per above

Prediction per replicas for last bin: PDF4LHC21_nnlo_mc: PDF4LHC21_nnlo_mc.txt

but for the rest of the rapidity bins. Would that be possible? This way we can show that the issue of cross-section negativity becomes only important at the highest mass bin.

Thanks!!!

juanrojochacon commented 2 years ago

also, can you remind me what are the units of the cross-section?

Radonirinaunimi commented 2 years ago

> Hi @Radonirinaunimi I would like to get the same numbers as per above
>
> Prediction per replicas for last bin: PDF4LHC21_nnlo_mc: PDF4LHC21_nnlo_mc.txt
>
> but for the rest of the rapidity bins. Would that be possible? This way we can show that the issue of cross-section negativity becomes only important at the highest mass bin.

@juanrojochacon Yes! That is possible. I will produce them and post here the results.

> also, can you remind me what are the units of the cross-section?

For the differential distribution, the cross-section is expressed in pb/GeV.

Radonirinaunimi commented 2 years ago

@juanrojochacon, here are the results of PDF4LHC21_nnlo_mc_${BIN}.txt:

BIN Pred/replica
0 PDF4LHC21_nnlo_mc_0.txt
1 PDF4LHC21_nnlo_mc_1.txt
2 PDF4LHC21_nnlo_mc_2.txt
3 PDF4LHC21_nnlo_mc_3.txt
4 PDF4LHC21_nnlo_mc_4.txt
5 PDF4LHC21_nnlo_mc_5.txt

juanrojochacon commented 2 years ago

Perfect, many thanks. Those in PDF4LHC21_nnlo_mc_5.txt are the ones you already produced, right?