cp3-llbb / CommonTools

Utilities to mass create histograms, flat trees, and more

Need to apply weights on data #136

Closed swertz closed 7 years ago

swertz commented 7 years ago

In order to implement our DY estimation from data, we need to be able to... apply weights on data. However the plotter forbids it: https://github.com/cp3-llbb/CommonTools/blob/master/Factories/templates/Plot.tpl#L6

It's trivial to fix but I'm wondering if we want to abandon this useful safeguard, or if there is another way to bypass this?

OlivierBondu commented 7 years ago

For 'data' itself it is probably safer to keep the safeguard, I agree...

Here in fact we have neither MC nor data but a 'data-driven estimate', so to be strict maybe it should be an entirely new type altogether? Would that be too heavy to support?

swertz commented 7 years ago

The issue is that we have to apply the weights on data and on the other background MC, and subtract to get the estimate, so we do have to run on data at some point.
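To make the subtraction concrete, here is a minimal sketch using NumPy. It is not the plotter's code: the function names, the binning and the toy numbers are all made up for illustration, and it assumes the estimate is simply the reweighted data histogram minus the sum of the reweighted non-DY MC histograms, bin by bin.

```python
import numpy as np

def weighted_hist(values, weights, edges):
    """Fill a histogram where each event contributes its own weight."""
    counts, _ = np.histogram(values, bins=edges, weights=weights)
    return counts

def dy_estimate(data_hist, other_mc_hists):
    """Reweighted data minus the sum of the reweighted non-DY MC (hypothetical helper)."""
    return data_hist - np.sum(other_mc_hists, axis=0)

# Toy inputs, purely illustrative.
edges = np.array([0.0, 50.0, 100.0, 150.0])
data_hist = weighted_hist(
    np.array([10.0, 60.0, 70.0, 120.0]),   # observable per data event
    np.array([0.5, 0.2, 0.2, 0.1]),        # per-event reweighting factor
    edges,
)
ttbar_hist = weighted_hist(
    np.array([20.0, 80.0]),
    np.array([0.1, 0.05]),                 # reweighting factor times MC weight
    edges,
)

estimate = dy_estimate(data_hist, [ttbar_hist])
```

The point is only that data has to be filled with a weight different from 1, which is exactly what the safeguard currently prevents.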

But good point. Right now I produce the reweighted and non-reweighted histograms on data using the same dataset JSONs. It should be possible to run twice on data: once without reweighting, and once with reweighting but with the "is-data" flag removed from the JSON. I'll see how that goes...
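A rough sketch of the "run twice" idea, assuming the dataset JSON is a dictionary of entries that each carry an "is-data" key. The entry layout, the dataset name and the helper name are assumptions for illustration, not the actual CommonTools format.

```python
import copy

def without_data_flag(datasets):
    """Return a copy of the dataset description with "is-data" removed from every entry."""
    stripped = copy.deepcopy(datasets)
    for entry in stripped.values():
        entry.pop("is-data", None)
    return stripped

# Hypothetical dataset description for the nominal (unweighted) pass.
nominal = {
    "DoubleMuon_Run2016": {"files": ["data.root"], "is-data": True},
}

# Second pass: same files, but no longer flagged as data, so weights get applied.
reweighted = without_data_flag(nominal)
```

The nominal pass keeps the safeguard, and only the second description is fed to the reweighting run.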

pieterdavid commented 7 years ago

For my fake-lepton backgrounds I ended up generating another plots.json and plotter that fills the data-driven background histograms with the appropriate weights. (To be honest, I did get rid of the safeguard earlier, since all scale factors are 1 for data anyway. Actually, I could keep it for the "nominal" plotter and only remove it for the "data-driven backgrounds" one.)

If you want it in the same plotter, you'd have to pass another "tag" to say which weight you want for data, because you can't hardcode that choice. If you have only one weight that's stored in a branch, you can indeed abuse event_weight and is-data and run once with and once without, but with systematic variations you can quickly end up with a few different weights. Eventually you would specify the different weights in the JSON (weight_MC, weight_DYBkg, etc.) for each plot, because you may need the weights only for plots after a certain selection stage, and introduce a bit more logic in the plotter as well...
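The "tag" logic could look roughly like the sketch below. The weight_MC and weight_DYBkg keys are taken from the comment above; the function name, the plot dictionaries and the convention of returning None for "do not fill this plot" are assumptions, not existing plotter behaviour.

```python
def select_weight(plot, tag=None, is_data=False):
    """Pick the weight expression for one plot (hypothetical helper).

    With no tag, data keeps the safeguard (weight "1") and MC uses weight_MC.
    With a tag such as "DYBkg", the plot must define "weight_DYBkg";
    otherwise None is returned, meaning the plot is skipped for this tag.
    """
    if tag is None:
        return "1" if is_data else plot.get("weight_MC", "1")
    return plot.get("weight_" + tag)

# Hypothetical plot definitions: only the later selection stage defines the DY weight.
early_plot = {"variable": "ll_M", "weight_MC": "event_weight"}
late_plot = {
    "variable": "jj_M",
    "weight_MC": "event_weight",
    "weight_DYBkg": "event_weight * dy_reweighting",
}

nominal_data_weight = select_weight(late_plot, is_data=True)
dy_data_weight = select_weight(late_plot, tag="DYBkg", is_data=True)
skipped = select_weight(early_plot, tag="DYBkg", is_data=True)
```

Keeping the untagged path unchanged means the nominal plotter still refuses to weight data, and only the tagged pass opts in.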