cms-nanoAOD / correctionlib

A generic correction library
https://cms-nanoaod.github.io/correctionlib/
BSD 3-Clause "New" or "Revised" License

Remove corrections from set via correction command #122

Open IzaakWN opened 2 years ago

IzaakWN commented 2 years ago

Some of the full JSONs provided by the TauPOG are quite large because of the trigger SFs. Many tau analysts won't need these, so an easy way to remove unneeded data from a JSON could save some disk space and initial loading time.

Removing corrections from correction sets

We could add some type of "remove" and/or "filter" subcommand to correction, to remove some Correction objects from a CorrectionSet in a file. Something like

correction remove tau.json:tau_trigger -f pretty -o tau_small.json
correction remove tau.json:tau_trigger -f pretty > tau_small.json

where the Correction object tau_trigger is removed from the set in the input file tau.json. It would not be difficult to implement using merge as a basis. If you want, I can have a look later and prepare a PR?
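To make the idea concrete, here is a minimal sketch of the whole-correction removal done directly on the JSON with the standard library only. It is not an existing correctionlib command or API: the helper name `remove_corrections` is hypothetical, and the sketch assumes the file follows the v2 layout with a top-level `corrections` list whose entries each have a `name`. Anything else in the set (for instance compound corrections that reference a removed name) is ignored here.

```python
import json


def remove_corrections(cset: dict, names: set) -> dict:
    """Return a copy of a CorrectionSet dict without the named corrections.

    Hypothetical helper; assumes a top-level "corrections" list of
    objects that each carry a "name" field.
    """
    present = {corr["name"] for corr in cset["corrections"]}
    missing = names - present
    if missing:
        raise KeyError(f"corrections not found in set: {sorted(missing)}")
    out = dict(cset)  # shallow copy, the input dict is left untouched
    out["corrections"] = [
        corr for corr in cset["corrections"] if corr["name"] not in names
    ]
    return out


# Toy stand-in for tau.json; real files carry far more fields per correction.
cset = {
    "schema_version": 2,
    "corrections": [
        {"name": "tau_trigger", "data": 1.0},
        {"name": "sfs", "data": 1.0},
    ],
}

small = remove_corrections(cset, {"tau_trigger"})
# Stand-in for "-f pretty": indented JSON output
pretty = json.dumps(small, indent=2)
print(pretty)
```

For a real file one would `json.load` the input and write `pretty` to the output path instead of printing it.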

Removing keys from corrections

Possibly related to issues https://github.com/cms-nanoAOD/correctionlib/issues/15 and https://github.com/cms-nanoAOD/correctionlib/issues/38: this could later be extended to remove slices from a Correction object. For example, to remove a specific category key you don't need, such as an ID, WP, or systematic variation, you would do

correction remove tau.json:sfs -k id=DeepTau -o tau_small.json -f pretty

where the script would recursively go through the data of the Correction object sfs and remove all data that is assigned exclusively to the key DeepTau of the pre-defined input variable id. Another example is removing some unneeded WPs:

correction remove tau.json:tau_trigger -k wp=Loose wp=Tight -o tau_small.json -f pretty

if one only wants to keep Medium.
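The recursive key removal could be sketched roughly as below, again on plain JSON dicts rather than through any existing correctionlib API. The function name `drop_category_keys` is hypothetical. The sketch assumes category nodes are stored as objects with `"nodetype": "category"`, an `"input"` name, and a `"content"` list of `{"key": ..., "value": ...}` entries. The toy correction and its `MVA` key are made up for illustration.

```python
def drop_category_keys(node, input_name, keys):
    """Return a copy of `node` with the given keys removed from every
    category node that switches on `input_name`.

    Hypothetical helper operating on the raw JSON structure.
    """
    if isinstance(node, list):
        return [drop_category_keys(item, input_name, keys) for item in node]
    if not isinstance(node, dict):
        return node  # plain numbers and strings are returned unchanged
    if node.get("nodetype") == "category" and node.get("input") == input_name:
        # Drop the unwanted entries before descending into the remaining ones
        kept = [entry for entry in node["content"] if entry["key"] not in keys]
        node = {**node, "content": kept}
    return {
        name: drop_category_keys(value, input_name, keys)
        for name, value in node.items()
    }


# Toy correction: category on "id", then a nested category on "wp".
sfs = {
    "name": "sfs",
    "data": {
        "nodetype": "category",
        "input": "id",
        "content": [
            {
                "key": "DeepTau",
                "value": {
                    "nodetype": "category",
                    "input": "wp",
                    "content": [
                        {"key": "Loose", "value": 0.90},
                        {"key": "Medium", "value": 0.95},
                        {"key": "Tight", "value": 0.97},
                    ],
                },
            },
            {"key": "MVA", "value": 1.0},
        ],
    },
}

no_deeptau = drop_category_keys(sfs, "id", {"DeepTau"})
medium_only = drop_category_keys(sfs, "wp", {"Loose", "Tight"})
```

Note that this only prunes the data tree. A real implementation would probably also want to handle the case where a category ends up empty or with a single remaining key.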

nsmith- commented 2 years ago

Sorry for the delay in responding, but this is a good addition. Perhaps with regard to the UI we could have a correction filter command that takes in a single JSON file and then allows a sequence of keep and drop commands, each of which can address either a whole correction or perhaps even a key within a correction. e.g.

correction filter tau.json --keep tau_trigger:wp=Loose --keep tau_trigger:wp=Tight --drop '*' -o tau_small.json

I'm happy to assist with the implementation. Perhaps we start with just keeping whole corrections?
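As a starting point for the whole-correction case, the keep/drop sequence could be resolved along these lines. This is only a sketch under assumptions not settled in this thread: rules are checked in the order given, the first matching rule wins, patterns are shell-style globs, and a correction matched by no rule is kept. The function name `select_corrections` and the example correction names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_corrections(names, rules, default_keep=True):
    """Return the correction names that survive an ordered keep/drop list.

    `rules` is a list of (action, pattern) pairs with action "keep" or
    "drop". Assumption: the first rule whose glob pattern matches decides.
    """
    kept = []
    for name in names:
        decision = default_keep
        for action, pattern in rules:
            if action not in ("keep", "drop"):
                raise ValueError(f"unknown action: {action!r}")
            if fnmatchcase(name, pattern):
                decision = action == "keep"
                break  # first match wins
        if decision:
            kept.append(name)
    return kept


# Roughly: --keep tau_trigger --drop '*'
names = ["tau_trigger", "sfs", "energy_scale"]
rules = [("keep", "tau_trigger"), ("drop", "*")]
selected = select_corrections(names, rules)
print(selected)  # ['tau_trigger']
```

With first-match-wins, a trailing `--drop '*'` acts as "drop everything not explicitly kept". If last-match-wins is preferred instead, the wildcard rule would have to come first.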