Closed OlivierBondu closed 7 years ago
IMO there's not much we can do, since the SF is applied on MC, which has no notion of runs. The only solution is to store all the possible scale factors (like we do for the various WP), and then randomly apply a given scale factor with a probability proportional to the integrated luminosity of its run range.
We can maybe automate this on the framework side, but is this what we want?
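To make the idea concrete, here is a minimal sketch of a luminosity-weighted random choice. The period labels, luminosities and scale factors are placeholders invented for illustration, and `pick_scale_factor` is a hypothetical helper, not something that exists in the framework.

```python
import random

# Hypothetical run periods: (label, integrated luminosity in fb^-1, scale factor).
# All numbers are placeholders, not real measurements.
PERIODS = [
    ("period_A", 5.0, 0.98),
    ("period_B", 15.0, 0.95),
]


def pick_scale_factor(rng, periods=PERIODS):
    """Return (label, sf) for one period, chosen with probability
    proportional to that period's integrated luminosity."""
    candidates = [(label, sf) for label, _, sf in periods]
    weights = [lumi for _, lumi, _ in periods]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Draw many times to check that the periods are sampled in the expected proportion.
rng = random.Random(12345)
picks = [pick_scale_factor(rng) for _ in range(10000)]
frac_b = sum(1 for label, _ in picks if label == "period_B") / len(picks)
```

With these placeholder luminosities, about three quarters of the MC events should receive the `period_B` scale factor.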
you mean this may be more appropriately handled on the plotIt side ?
I was more specifically thinking of the histFactory phase, where we have all the information. But maybe doing it directly in the framework would be doable, meaning only one SF to deal with instead of a bunch...
mmm, in fact doing it at a later phase would probably be better: if we are asked to do some plots per run period, would we have to redo a prod for each?
Ah yes indeed...
so we need two things:
First one we already have
I am not sure I got the point here: if we decide to deal with run-dependent SFs in the framework (which seems to be the easiest IMO), everything would be transparent at the histFactory stage. Even if we are required to run on a specific period, the SF would automatically be the correct one, without redoing a prod, just by applying a selection on run/event number. What did I miss?
There's no run number in MC, so you have to pick a scale factor randomly, ensuring you correctly probe the SF space according to the integrated luminosity of each run period. This is done during the production, where you assume you know this luminosity. If you want only one period, this assumption is no longer valid.
Idea for 2: can we use the lumi block & event number (or a function of these - assuming that they are related to the seed(s) used to generate the event) to seed the random number generator for each event? That would be an elegant way to keep the "random" numbers reproducible. If the relative lumi numbers are reliable enough (I would expect lumi calibrations to mostly change a common scale factor - at least as long as the lumi measurement does not strongly depend on pileup etc.), at the framework stage is also an option (as long as you save the run block or "virtual run number", you are still able to make per-block plots). Personally I would prefer to store all SF in any case, then you can also make per-period plots with the full MC statistics if you want, or even change your mind about the lumi numbers (if it's less work to implement that in histFactory than rerunning the productions at that point ;-) ).
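A minimal sketch of the seeding idea, assuming the lumi block and event number are available as plain integers. The names `event_rng` and `pick_period` are hypothetical, and hashing the pair with SHA-256 is just one way to turn it into a seed, chosen here because it does not depend on the interpreter's hash randomization.

```python
import hashlib
import random


def event_rng(lumi_block, event_number):
    """Build a random generator whose seed depends only on the event identifiers."""
    key = f"{lumi_block}:{event_number}".encode()
    # Use the first 8 bytes of the digest as a 64-bit integer seed.
    seed = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return random.Random(seed)


def pick_period(lumi_block, event_number, periods):
    """Choose a 'virtual run period' for an MC event, weighted by luminosity.

    `periods` is a list of (label, integrated_luminosity) pairs (placeholders).
    """
    rng = event_rng(lumi_block, event_number)
    labels = [label for label, _ in periods]
    weights = [lumi for _, lumi in periods]
    return rng.choices(labels, weights=weights, k=1)[0]


# Placeholder periods; the same event always maps to the same period.
periods = [("period_A", 5.0), ("period_B", 15.0)]
first = pick_period(12, 34567, periods)
second = pick_period(12, 34567, periods)
```

Because the generator is rebuilt from the event identifiers each time, rerunning over the same MC file gives the same choice, and further draws from the same generator (for example for the SF uncertainty) are reproducible too.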
I agree that storing everything and choosing the correct SF in histFactory is more reliable :)
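As a rough sketch of what "store everything, choose later" could look like at the histogram stage: each MC event keeps one SF per period, and the weight is computed only once the periods of interest are known. Combining the stored SFs through a luminosity-weighted average is an assumption made for this example, and `event_weight`, the period labels and all numbers are hypothetical.

```python
def event_weight(stored_sfs, lumis, selected=None):
    """Luminosity-weighted average of the stored per-period SFs.

    stored_sfs: dict mapping period label to the SF saved for this event.
    lumis:      dict mapping period label to integrated luminosity.
    selected:   periods to include; defaults to all periods in `lumis`.
    """
    selected = list(lumis) if selected is None else list(selected)
    total = sum(lumis[p] for p in selected)
    if total <= 0:
        raise ValueError("selected periods must have positive total luminosity")
    return sum(stored_sfs[p] * lumis[p] for p in selected) / total


# Placeholder inputs, not real measurements.
lumis = {"period_A": 5.0, "period_B": 15.0}
sfs = {"period_A": 0.98, "period_B": 0.94}

w_all = event_weight(sfs, lumis)                 # full dataset
w_a = event_weight(sfs, lumis, ["period_A"])     # per-period plot
```

Since nothing is decided at production time, a per-period plot uses the full MC statistics, and changing the luminosity numbers only requires rerunning this step.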
I really like the lumi block + event number approach to seed the random number generator. This would be a way to reliably get the same random number to access a scale factor, but also its uncertainty :)
Additional note: the same run-dependence feature will be needed for trigger efficiencies... e.g. from HWW: https://twiki.cern.ch/twiki/bin/view/CMS/HWW2016TriggerAndIdIsoScaleFactorsResults#Trigger_Efficiency_for_Single_Mu
Given that we now have the 'weighted SF', is this still needed? Maybe for @BrieucF if 'progressive unblinding' is still planned, but I guess for this use case you can still have a dedicated 'weighted SF' file corresponding to the lumi under study
IMO this can be closed thanks to https://github.com/cp3-llbb/Framework/pull/216, does anyone object?
No objection. I don't plan to run MIS on 2016 data in the near future.
needed if we want to use HWW SFs for 80X