workflow modifications for systematics

emanca commented 3 years ago

Here is a suggestion to modify the framework to deal with systematics internally. Let's define a JSON file with entries like this:

"ZZ_TuneCUETP8M1_13TeV-pythia8": {
    "xsec": 16.523,
    "systematics": {
        "PDFs": {
            "type": ["weight"],                 // it's a weight variation
            "weight": ["nom"],                  // don't change the weight w.r.t. nominal
            "variations": ["LHEPdfWeightHess"]  // name of the column containing the variations
        }
    },
    "dir": {
        "ZZ_TuneCUETP8M1_13TeV-pythia8": ["tree.root"]
    }
},

Then a dedicated RDataFrame module will make all the Define calls under the hood and feed the boost histogram helper, until ROOT ships DefinePerSample and Variate in the common distributions. @sroychow @bianchini what do you think?
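
As a rough illustration of the idea, here is a minimal Python sketch of how such a module might translate a sample entry into a list of Define operations. Everything beyond the JSON keys is an assumption: the function name `plan_defines`, the nominal weight column `weight_nom`, the naming of the new columns, and the convention that a weight variation is the nominal weight multiplied by the variation column are all hypothetical and not taken from the framework.

```python
def plan_defines(sample_cfg, nominal_weight="weight_nom"):
    """List (new_column, expression) pairs that a helper could pass to Define.

    Hypothetical helper: only entries whose "type" contains "weight" are handled.
    """
    plans = []
    for name, syst in sample_cfg.get("systematics", {}).items():
        if "weight" not in syst["type"]:
            continue  # other variation types are outside this sketch
        for column in syst["variations"]:
            # Assumed convention: varied weight = nominal weight * variation column
            new_column = f"{nominal_weight}_{name}_{column}"
            plans.append((new_column, f"{nominal_weight} * {column}"))
    return plans


# Sample entry copied from the proposal (comments dropped, since JSON has none)
sample_cfg = {
    "xsec": 16.523,
    "systematics": {
        "PDFs": {
            "type": ["weight"],
            "weight": ["nom"],
            "variations": ["LHEPdfWeightHess"],
        }
    },
    "dir": {"ZZ_TuneCUETP8M1_13TeV-pythia8": ["tree.root"]},
}

defines = plan_defines(sample_cfg)
```

Each pair could then be applied with something like `df = df.Define(new_column, expression)` on the sample's RDataFrame before booking the histograms.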

emanca commented 3 years ago

This will also help with #68

sroychow commented 3 years ago

@emanca The proposal seems reasonable to me, but I have a suggestion on the structure. I think we can define the systematics in a common place and then refer to them only by name in the samples.

"ZZ_TuneCUETP8M1_13TeV-pythia8": {
    "xsec": 16.523,
    "systematics": ["PDF", "Lumi", ......],
    "dir": {
        "ZZ_TuneCUETP8M1_13TeV-pythia8": ["tree.root"]
    }
},

The attributes of PDF/Scale and the others can then be defined in a common place. Internally, the JSON parsing will look them up and use them. This saves us from repeating the same information for every sample.
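
A minimal Python sketch of that lookup, under stated assumptions: the names `COMMON_SYSTEMATICS`, `SAMPLES`, and `resolve_systematics` are hypothetical, and the attribute layout of the PDF entry is simply borrowed from the first proposal in this thread. Only PDF is defined here because the thread does not spell out the attributes of the other systematics.

```python
import json

# Hypothetical common block holding the attributes of each named systematic
COMMON_SYSTEMATICS = json.loads("""
{
  "PDF": {
    "type": ["weight"],
    "weight": ["nom"],
    "variations": ["LHEPdfWeightHess"]
  }
}
""")

# Samples refer to systematics by name only
SAMPLES = json.loads("""
{
  "ZZ_TuneCUETP8M1_13TeV-pythia8": {
    "xsec": 16.523,
    "systematics": ["PDF"],
    "dir": {"ZZ_TuneCUETP8M1_13TeV-pythia8": ["tree.root"]}
  }
}
""")


def resolve_systematics(sample, common):
    """Return {name: attributes} for every systematic the sample lists by name."""
    missing = [name for name in sample["systematics"] if name not in common]
    if missing:
        # Fail early rather than silently skipping an undefined systematic
        raise KeyError(f"systematics not defined in the common block: {missing}")
    return {name: common[name] for name in sample["systematics"]}


resolved = resolve_systematics(
    SAMPLES["ZZ_TuneCUETP8M1_13TeV-pythia8"], COMMON_SYSTEMATICS
)
```

Raising on an unknown name is one possible design choice; it catches typos in a sample's list at parse time instead of producing a histogram set with a variation quietly missing.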

emanca commented 3 years ago

Yes, I totally agree.

emanca commented 3 years ago

Reopening this in the core framework repository:

emanca / wproperties-analysis

workflow modifications for systematics #79