cms-nanoAOD / cmssw

CMS NanoAOD software integration repository
http://cms-sw.github.io/
Apache License 2.0

Store LHA ID of the original PDF set #590

Open ktht opened 2 years ago

ktht commented 2 years ago

The LHA ID of the PDF set stored in the LHEScaleWeight branch does not necessarily correspond to the original PDF set that was used to produce the sample. This can be problematic if an analyst wants to reweight to a different PDF set (which requires knowing the original PDF set) that is not in the priority list, or if the list is not kept up to date (as was the case with the pre-legacy NanoAOD Ntuples). Furthermore, knowing the original LHA ID is useful for documenting and citing the correct sources.

One way to access the original LHA IDs is to do the following:

const lhef::HEPRUP &heprup = lheInfo->heprup();
const int original_lhaid = heprup.PDFSUP.first; // same as heprup.PDFSUP.second

Then update pdfDoc accordingly. This gives the correct LHA ID in 95% of cases, but in some rare instances the stored value is simply -1. I think that is a separate problem, though, and one that can be solved with GEN.
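To make the -1 case explicit, here is a minimal sketch of how the lookup could be guarded before pdfDoc is updated. `HeprupStub` and `originalLhaId` are hypothetical names introduced only for illustration; the stub mimics the single `PDFSUP` field of `lhef::HEPRUP` so the snippet compiles outside CMSSW. Treating a mismatch between the two beams as "unknown" is an assumption, not something the issue prescribes.

```cpp
#include <optional>
#include <utility>

// Hypothetical stand-in for the one lhef::HEPRUP field used here; in CMSSW the
// real object would come from lheInfo->heprup().
struct HeprupStub {
  std::pair<int, int> PDFSUP;  // PDF set IDs for beam 1 and beam 2
};

// Returns the original LHA ID, or std::nullopt when the stored value is not
// usable: a non-positive value such as -1, or (assumption) the two beams
// reporting different IDs.
std::optional<int> originalLhaId(const HeprupStub& heprup) {
  const int first = heprup.PDFSUP.first;
  const int second = heprup.PDFSUP.second;
  if (first <= 0 || first != second) {
    return std::nullopt;
  }
  return first;
}
```

A caller could then append the ID to pdfDoc only when the returned optional holds a value, and fall back to the current documentation string otherwise.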

I can prepare a PR upstream if needed.

ktht commented 2 years ago

There's a mistake in my original post: to reweight to an alternative PDF set, we do not need to know the original PDF set that the sample was produced with, provided the LHEPdfWeight branch already stores weights for some other PDF set. This is because the nominal PDF weight (LHEPdfWeight[0]) is proportional to PDFs(other)/PDFs(original), so it is sufficient to know the LHA ID of the PDF set for which weights are available. The reweighting can be done by simply assigning LHEPdfWeight[0] * PDFs(alternative) / PDFs(other) to every event, where the PDFs are computed with LHAPDF.
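The per-event formula above can be sketched as follows. This is only an illustration under stated assumptions: `PartonDensity`, `PartonKinematics`, `pdfProduct` and `alternativeWeight` are hypothetical names, and "PDFs(X)" is taken to mean the product of the two incoming-parton densities evaluated with set X. The densities are passed in as callables so that the snippet stays self-contained; with LHAPDF, such a callable could wrap `LHAPDF::PDF::xfxQ(id, x, Q)`.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical callable returning x*f(x, Q) for the parton with PDG ID `id`.
// With LHAPDF it could wrap LHAPDF::PDF::xfxQ(id, x, q).
using PartonDensity = std::function<double(int id, double x, double q)>;

// Hypothetical container for the generator-level quantities needed per event.
struct PartonKinematics {
  int id1;       // PDG ID of the first incoming parton
  int id2;       // PDG ID of the second incoming parton
  double x1;     // momentum fraction of the first parton
  double x2;     // momentum fraction of the second parton
  double scale;  // factorization scale Q
};

// Product of the two parton densities for one event. The explicit factors of
// x cancel in any ratio taken at the same kinematics.
double pdfProduct(const PartonDensity& pdf, const PartonKinematics& k) {
  return pdf(k.id1, k.x1, k.scale) * pdf(k.id2, k.x2, k.scale);
}

// Computes LHEPdfWeight[0] * PDFs(alternative) / PDFs(other) for one event.
double alternativeWeight(double lhePdfWeight0,
                         const PartonDensity& alternative,
                         const PartonDensity& other,
                         const PartonKinematics& k) {
  const double denominator = pdfProduct(other, k);
  if (denominator == 0.0) {
    throw std::domain_error("PDFs(other) is zero for this event");
  }
  return lhePdfWeight0 * pdfProduct(alternative, k) / denominator;
}
```

Passing the same density for `alternative` and `other` returns LHEPdfWeight[0] unchanged, which is a quick sanity check on the bookkeeping.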

That said, I still think that storing this information somewhere in NanoAOD is useful for documentation purposes, especially given that the computational overhead is negligible.

vlimant commented 1 year ago

@kdlong is this in https://github.com/cms-sw/cmssw/pull/32167 too?