HEP-KBFI / TallinnNtupleProducer

code, python scripts and config files for producing "plain" Tallinn Ntuples

Transition from nanoAOD-tools to correctionlib #8

Closed veelken closed 2 years ago

veelken commented 2 years ago

I am posting an email from Karl on this subject for reference:

Hi Christian, all,

At Christian's request I compiled some references on correctionlib and added my commentary. It's not clear to me whether the nanoAOD-tools package is still actively maintained, considering that lots of issues and PRs are still open [1]. For instance, it's not very reassuring that support for the latest CMSSW version, 12X, has been broken for almost half a year [2].

A likely successor to nanoAOD-tools seems to be correctionlib [3], which takes a JSON schema as input and builds an interface from it for querying SFs and uncertainties. It has both Python and C++ interfaces, and should work with and without CMSSW, which are all good-to-have features. Preliminary JSON schemas are available in [4] and generated documentation in [5]. The only schemas that are compatible with pre-legacy samples (so NanoAODv7 and earlier) are those provided by the Tau POG; the rest are valid only for the UL samples. If we don't care about the physics at all, we could use UL schemas instead, or generate our own JSON files from the existing corrections that we have. For instance, JetMET POG has a script that converts JEC and JER from tarballed text files into JSON [6].
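
To make the idea of "JSON in, queryable SFs and uncertainties out" concrete, here is a toy sketch in plain Python. It deliberately does not use correctionlib or its real schema: the JSON layout, the name `toy_muon_sf`, the bin edges, the values and the `evaluate` helper are all made up for illustration.

```python
import bisect
import json

# Hypothetical JSON payload: scale factors binned in pT, with up/down variations.
# This layout is an illustrative assumption, NOT the correctionlib schema.
TOY_JSON = """
{
  "name": "toy_muon_sf",
  "pt_edges": [20.0, 40.0, 60.0, 120.0],
  "values": {
    "nominal": [0.98, 0.99, 1.00],
    "up":      [0.99, 1.00, 1.02],
    "down":    [0.97, 0.98, 0.98]
  }
}
"""


def evaluate(correction, pt, variation="nominal"):
    """Return the scale factor of the pT bin that contains `pt`."""
    edges = correction["pt_edges"]
    # Bins are half-open [low, high); anything outside the edges is an error.
    if not edges[0] <= pt < edges[-1]:
        raise ValueError(f"pt={pt} is outside the binned range")
    index = bisect.bisect_right(edges, pt) - 1
    return correction["values"][variation][index]


toy = json.loads(TOY_JSON)
print(evaluate(toy, 25.0))         # nominal SF of the first bin
print(evaluate(toy, 100.0, "up"))  # upward variation of the last bin
```

The point is only that the analysis code asks for a value by name, kinematics and variation, while the numbers themselves live in a data file that can be swapped per era.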

If I understand the whole concept behind correctionlib correctly, then it does not provide the logic of, e.g., applying JES and JER uncertainties consistently to jets and MET at the same time. I think this may be the only scenario where we have to propagate uncertainties meant for one kind of object (jets) to another kind of object (MET). I currently see two options for moving forward:

1) move all corrections and uncertainties to the level of flat Ntuple production. The real price to pay here is runtime. However, I don't have a good idea of how much longer the jobs would actually take if we start the flat Ntuple production from vanilla NanoAOD. We would also need to implement the logic of applying JES and JER uncertainties ourselves, but I would consider this piece of code a static, one-time investment. On the upside, it would completely eliminate the need to run post-processing, which makes the whole file cataloging business a lot easier to implement, as we wouldn't have to maintain essentially two sample dictionaries at once: one for the Ntuples before they're post-processed, and another for after they're post-processed. The code would also become universal, since it could be run on any distributed system that is recognized by the law framework;

2) continue using nanoAOD-tools, but add a module that interacts with correctionlib and creates new branches based on this information. We probably need to duplicate the functionality of nanoAOD-tools in [7]. We could then drop our fork in favor of upstream.
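
As a rough illustration of the jet-to-MET propagation that either option has to provide, here is a minimal sketch in plain Python. It assumes a simplified picture in which each jet carries a single relative JES uncertainty and the MET is shifted by minus the vector sum of the jet pT changes; the `Jet` class, its `jes_unc` field and the function name are hypothetical. A real implementation would also apply the jet selection and other details used for the MET correction, and would handle JER smearing; none of that is shown here.

```python
import math
from dataclasses import dataclass


@dataclass
class Jet:
    pt: float       # nominal transverse momentum (GeV)
    phi: float      # azimuthal angle (rad)
    jes_unc: float  # relative JES uncertainty, e.g. 0.02 for 2% (hypothetical input)


def shift_jets_and_met(jets, met_pt, met_phi, direction):
    """Shift every jet pT by its JES uncertainty and propagate the change to MET.

    direction is +1 for the "up" variation and -1 for the "down" variation.
    Returns (shifted_jets, shifted_met_pt, shifted_met_phi).
    """
    met_px = met_pt * math.cos(met_phi)
    met_py = met_pt * math.sin(met_phi)
    shifted_jets = []
    for jet in jets:
        new_pt = jet.pt * (1.0 + direction * jet.jes_unc)
        delta_pt = new_pt - jet.pt
        # Extra visible momentum along the jet direction reduces the
        # missing momentum by the same vector, and vice versa.
        met_px -= delta_pt * math.cos(jet.phi)
        met_py -= delta_pt * math.sin(jet.phi)
        shifted_jets.append(Jet(new_pt, jet.phi, jet.jes_unc))
    return shifted_jets, math.hypot(met_px, met_py), math.atan2(met_py, met_px)


jets = [Jet(pt=100.0, phi=0.0, jes_unc=0.10)]
shifted, new_met_pt, new_met_phi = shift_jets_and_met(jets, 50.0, 0.0, +1)
print(shifted[0].pt, new_met_pt, new_met_phi)
```

Keeping the jet shift and the MET update in one function is what guarantees that the two objects are varied coherently for a given systematic source.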

Whichever option we choose, I think we need to contact XPOG first in order to understand what the plans are for nanoAOD-tools. The correctionlib library is not publicly endorsed yet, AFAICT. I personally prefer the 1st option, but in order to measure how much the runtime blows up we'd have to implement this functionality first.

Please familiarize yourself with the links I sent and let's discuss the options again, either via email or in the meeting on June 10th. For now, let's operate on the assumption that correctionlib does not exist, so as to get the prototype working, starting from vanilla NanoAOD Ntuples and finishing with a set of datacards.

Have a nice weekend, Karl

[1] https://github.com/cms-nanoAOD/nanoAOD-tools
[2] https://github.com/cms-nanoAOD/nanoAOD-tools/issues/295
[3] https://github.com/cms-nanoAOD/correctionlib
[4] https://gitlab.cern.ch/cms-nanoAOD/jsonpog-integration
[5] https://cms-nanoaod-integration.web.cern.ch/commonJSONSFs/
[6] https://github.com/cms-jet/JECDatabase/blob/master/scripts/JERC2JSON/createJSONs.py
[7] https://github.com/HEP-KBFI/tth-nanoAOD-tools

swertz commented 2 years ago

Hi, you might be interested to try this out for propagating the corrections to the MET: https://gitlab.cern.ch/cp3-cms/CMSJMECalculators

I'm currently working to port that tool to correctionlib as well.

ktht commented 2 years ago

I consider this issue resolved. Relevant commits: