
nanoAOD-tools

A minimal set of tools for working with NanoAODs (depending only on python + root, not on the CMSSW framework)

Please note that, starting with CMSSW_13_3_0 (with backports for the coming 13_0_16 and 13_1_2), the framework part of NanoAODTools is maintained as a CMSSW package, in PhysicsTools/NanoAODTools.

This repository and the instructions below are still relevant only for older CMSSW releases.

Checkout instructions: standalone

You need to set up Python 2.7 and a recent ROOT version first.

git clone https://github.com/cms-nanoAOD/nanoAOD-tools.git NanoAODTools
cd NanoAODTools
bash standalone/env_standalone.sh build
source standalone/env_standalone.sh

Repeat only the last command at the beginning of every session.

Please never commit either the build directory or the empty __init__.py files created by the script.

Checkout instructions: CMSSW (CMSSW 12X and below)

cd $CMSSW_BASE/src
git clone https://github.com/cms-nanoAOD/nanoAOD-tools.git PhysicsTools/NanoAODTools
cd PhysicsTools/NanoAODTools
cmsenv
scram b

General instructions to run the post-processing step

The script to run the post-processing step is scripts/nano_postproc.py.

The basic syntax of the command is the following:

python scripts/nano_postproc.py /path/to/output_directory /path/to/input_tree.root

Please run the script with --help for a complete list of its features and options.

How to write and run modules

It is possible to import modules that will be run on each entry passing the event selection. Such modules can calculate new variables to be included in the output tree (in both friend and full mode) or apply event filter decisions.

We will use python/postprocessing/examples/exampleModule.py as an example. The module definition file, containing a simple constructor

   exampleModuleConstr = lambda: exampleProducer(jetSelection=lambda j: j.pt > 30)

should be imported using the following syntax:

python scripts/nano_postproc.py outDir /eos/cms/store/user/andrey/f.root -I PhysicsTools.NanoAODTools.postprocessing.examples.exampleModule exampleModuleConstr

Let us now examine the structure of the exampleProducer module class. All modules must inherit from PhysicsTools.NanoAODTools.postprocessing.framework.eventloop.Module.
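
As a rough illustration of this structure, the sketch below shows a module with the kind of hook methods a producer such as exampleProducer overrides (beginFile to book branches, analyze to process one event). So that it runs without ROOT or the framework, Module and RecordingOutputTree are stand-ins written for this example, and the jetCounter class, the nGoodJet branch, and the dictionary-based event are hypothetical. In a real module you would import Module from the framework instead.

```python
class Module(object):
    """Stand-in for ...postprocessing.framework.eventloop.Module (assumption)."""

    def beginJob(self):
        pass

    def endJob(self):
        pass

    def beginFile(self, inputFile, outputFile, inputTree, wrappedOutputTree):
        pass

    def endFile(self, inputFile, outputFile, inputTree, wrappedOutputTree):
        pass

    def analyze(self, event):
        return True


class RecordingOutputTree(object):
    """Hypothetical stand-in for wrappedOutputTree: stores the last filled values."""

    def __init__(self):
        self.values = {}

    def branch(self, name, rootBranchType):
        self.values[name] = None

    def fillBranch(self, name, value):
        self.values[name] = value


class jetCounter(Module):
    """Hypothetical producer: counts jets passing a selection."""

    def __init__(self, jetSelection):
        self.jetSel = jetSelection

    def beginFile(self, inputFile, outputFile, inputTree, wrappedOutputTree):
        # Book the new output branch once per input file.
        self.out = wrappedOutputTree
        self.out.branch("nGoodJet", "I")

    def analyze(self, event):
        # Here the event is a plain dict and the selection acts on pt values;
        # in the framework the selection would receive jet objects instead.
        n_good = len([pt for pt in event["Jet_pt"] if self.jetSel(pt)])
        self.out.fillBranch("nGoodJet", n_good)
        # Returning False acts as an event filter decision.
        return n_good > 0


# Constructor in the same style as exampleModuleConstr.
jetCounterConstr = lambda: jetCounter(jetSelection=lambda pt: pt > 30)

out = RecordingOutputTree()
module = jetCounterConstr()
module.beginFile(None, None, None, out)
passed = module.analyze({"Jet_pt": [45.0, 32.5, 12.0]})
```

The constructor is a zero-argument callable, so the selection is fixed where the module is defined and a fresh instance is built only when the callable is invoked.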

Keep/drop branches

See the effect of keep/drop instructions by running:

python scripts/nano_postproc.py outDir /eos/cms/store/user/andrey/f.root -I PhysicsTools.NanoAODTools.postprocessing.examples.exampleModule exampleModuleConstr -s _exaModu_keepdrop --bi scripts/keep_and_drop_input.txt --bo scripts/keep_and_drop_output.txt

Compare the result with that of the previous command (run without --bi and --bo). The output branch created by exampleModuleConstr is the same in both cases, but this command drops all other branches when creating the output tree. It also runs faster.
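
To make the idea of keep/drop rules concrete, here is a minimal sketch that applies a list of rules to a list of branch names. It assumes one rule per line of the form "keep <pattern>" or "drop <pattern>", with later rules overriding earlier ones. The rule text, the branch names, and the select_branches helper are all hypothetical; this is neither the content of the files in scripts/ nor the framework's own implementation.

```python
from fnmatch import fnmatchcase


def select_branches(branches, rules_text):
    """Return the branch names left enabled after applying the rules in order."""
    status = dict((name, True) for name in branches)  # everything kept by default
    for line in rules_text.splitlines():
        line = line.split("#")[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        action, pattern = line.split(None, 1)
        if action not in ("keep", "drop"):
            raise ValueError("unknown rule: %r" % line)
        for name in branches:
            if fnmatchcase(name, pattern):
                status[name] = (action == "keep")
    return [name for name in branches if status[name]]


# Hypothetical rules: drop everything, then re-enable jets except b-tag branches.
rules = """
drop *
keep Jet_*
drop Jet_btag*
"""
branches = ["run", "Jet_pt", "Jet_eta", "Jet_btagDeepB", "Muon_pt", "EventMass"]
kept = select_branches(branches, rules)
```

Under these assumptions only Jet_pt and Jet_eta survive. With fewer enabled branches there is less to read and write, which is consistent with the faster run noted above.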

The event interface, defined in PhysicsTools.NanoAODTools.postprocessing.framework.datamodel, allows views of objects organized in collections to be constructed dynamically from the branch names, for instance:

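from PhysicsTools.NanoAODTools.postprocessing.framework.datamodel import Collection
electrons = Collection(event, "Electron")  # event is the argument passed to analyze()
if len(electrons) > 1:
    print(electrons[0].someVar + electrons[1].someVar)
electrons_highpt = [e for e in electrons if e.pt > 50]  # electrons with pt above 50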

This accesses the elements of the Electron_someVar and Electron_pt branch arrays. Event-level variables can be accessed simply as event.someVar, for instance event.rho.
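
The mapping from attribute access to flat branch arrays can be sketched without ROOT. The classes below are hypothetical stand-ins, not the framework's Collection: they assume the event is a plain dict holding a counter named n<Prefix> and arrays named <Prefix>_<var>, and they resolve obj.var to the matching array element on demand.

```python
class ObjectView(object):
    """View of element `index` across all <prefix>_* arrays (hypothetical)."""

    def __init__(self, event, prefix, index):
        self._event = event
        self._prefix = prefix
        self._index = index

    def __getattr__(self, name):
        # Called only for attributes not set in __init__, e.g. obj.pt.
        try:
            return self._event[self._prefix + "_" + name][self._index]
        except KeyError:
            raise AttributeError(name)


class CollectionView(object):
    """Sequence of ObjectView items built from the branch-name prefix (hypothetical)."""

    def __init__(self, event, prefix):
        self._event = event
        self._prefix = prefix
        self._len = event["n" + prefix]  # assumed counter, e.g. nElectron

    def __len__(self):
        return self._len

    def __getitem__(self, index):
        if not 0 <= index < self._len:
            raise IndexError(index)  # also terminates iteration
        return ObjectView(self._event, self._prefix, index)


# Made-up event content for illustration.
event = {
    "nElectron": 3,
    "Electron_pt": [62.0, 48.5, 20.1],
    "Electron_eta": [0.3, -1.2, 2.1],
    "rho": 17.3,
}
electrons = CollectionView(event, "Electron")
pt_sum = electrons[0].pt + electrons[1].pt
electrons_highpt = [e for e in electrons if e.pt > 50]
```

Because the views look values up lazily, no per-object copy of the arrays is made; only the branches actually touched are read.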

The output branches should be filled by calling the fillBranch(branchname, value) method of wrappedOutputTree. Here value should be the desired value for single-value branches, or an iterable with the correct length for array branches. It is not necessary to fill the lenVar branch explicitly, as this is done automatically using the length of the passed iterable.
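
The difference between single-value and array branches can be illustrated with a small stand-in for wrappedOutputTree. SketchOutputTree below is hypothetical and only records values in a dict; the lenVar keyword follows the term used above, but the exact signature of the real branch method is an assumption, as are the branch names.

```python
class SketchOutputTree(object):
    """Hypothetical stand-in that records one entry's values in a dict."""

    def __init__(self):
        self._len_var = {}
        self.entry = {}

    def branch(self, name, rootBranchType, lenVar=None):
        # Remember which counter branch (if any) belongs to this branch.
        self._len_var[name] = lenVar

    def fillBranch(self, name, value):
        len_var = self._len_var[name]
        if len_var is not None:
            # Array branch: the counter is derived from the iterable's length.
            value = list(value)
            self.entry[len_var] = len(value)
        self.entry[name] = value


out = SketchOutputTree()
out.branch("EventMass", "F")                      # single-value branch
out.branch("GoodJet_pt", "F", lenVar="nGoodJet")  # array branch with counter

out.fillBranch("EventMass", 91.2)
out.fillBranch("GoodJet_pt", [45.0, 32.5])        # nGoodJet is never filled by hand
```

Deriving the counter from the iterable keeps the array and its length from getting out of sync.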

mht producer

Now let's have a look at another example, the file python/postprocessing/examples/mhtjuProducerCpp.py. Similarly, it should be imported using the following syntax:

python scripts/nano_postproc.py outDir /eos/cms/store/user/andrey/f.root -I PhysicsTools.NanoAODTools.postprocessing.examples.mhtjuProducerCpp mhtju

The producer in this module has the same structure as exampleProducer, but in addition it uses C++ code, src/mhtjuProducerCppWorker.cc, to calculate the mht variable. This code is loaded in the __init__ method of the producer.