cms-analysis / CombineHarvester

CMSSW package for the creation, editing and analysis of combine datacards and workspaces
cms-analysis.github.io/CombineHarvester/

Tool to gauge impact of individual parameters on process yields #268

Open pkausw opened 2 years ago

pkausw commented 2 years ago

This PR adds a new function, RateEvolution, to CombineHarvester. The function loops over the parameters in a given RooFitResult object. For each parameter, it sets that parameter in the workspace to its post-fit value while leaving all other parameters at their pre-fit values, then evaluates the yield of the processes. The results are stored in a map of the format

map[parname] = process_yield

where parname is the name of the parameter and process_yield is the yield of all processes in the harvester instance. The PR also adds bindings for the Python interface, where the map is accessible as a Python dictionary.
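The one-parameter-at-a-time evaluation described above can be illustrated without ROOT. The following is a minimal sketch, not the implementation in this PR: `toy_yield`, `rate_evolution`, and the parameter names and values are hypothetical stand-ins for the workspace and the fit result.

```python
# Toy stand-in for the workspace: a yield that depends on named parameters.
# All names and numbers are hypothetical and not part of CombineHarvester.

def toy_yield(params):
    """Nominal yield of 100 events scaled by two multiplicative nuisances."""
    return 100.0 * (1.0 + 0.10 * params["lumi"]) * (1.0 + 0.05 * params["jes"])


def rate_evolution(yield_fn, prefit, postfit):
    """Move one parameter at a time to its post-fit value and record the yield."""
    result = {}
    for name, value in postfit.items():
        shifted = dict(prefit)   # every parameter starts at its pre-fit value
        shifted[name] = value    # only this parameter is moved to post-fit
        result[name] = yield_fn(shifted)
    return result


prefit = {"lumi": 0.0, "jes": 0.0}
postfit = {"lumi": 0.5, "jes": -1.0}

# evolution[parname] = yield with only parname at its post-fit value
evolution = rate_evolution(toy_yield, prefit, postfit)
```

Each entry in `evolution` isolates the effect of one parameter, which is what makes the map useful for comparing parameters against each other.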

This information can be used to gauge the impact of individual parameters on process yields. For example:

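results = {}  # nested dictionary: results[bin][process][parname] = yield
for b in harvester.bin_set():
    bin_dict = {}
    for proc in process_list:
        # Restrict the harvester to a single bin and a single process
        proc_harvester = harvester.cp().bin([b]).process([proc])
        if len(proc_harvester.process_set()) == 0:
            print("Could not load process '{}' for bin '{}'".format(proc, b))
            continue
        # fit is the fit result object handed to RateEvolution
        bin_dict[proc] = proc_harvester.RateEvolution(fit)
    results[b] = bin_dict  # keep the per-bin dictionary for later use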

where process_list is the list of processes to consider. The resulting nested dictionary can be used, for example, to create a 2D histogram of the yield changes that a given parameter introduces for each process (see the linked example). This functionality helped to further analyze and understand the statistical model in HIG-19-011 and might also be of interest to other analyses.
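As a rough sketch of how the per-bin dictionary could be turned into the input for such a 2D histogram, the helper below computes relative yield changes per process and parameter. It assumes that the stored values are absolute yields and that the pre-fit yields are obtained separately; `relative_changes`, the process names, and all numbers are hypothetical.

```python
def relative_changes(evolution_by_process, prefit_yields):
    """Return (processes, parameters, matrix), where
    matrix[i][j] = (N_shifted - N_prefit) / N_prefit
    for process i and parameter j."""
    processes = sorted(evolution_by_process)
    parameters = sorted(
        {par for evo in evolution_by_process.values() for par in evo}
    )
    matrix = []
    for proc in processes:
        ref = prefit_yields[proc]
        # A parameter missing for this process is treated as "no change"
        row = [
            (evolution_by_process[proc].get(par, ref) - ref) / ref
            for par in parameters
        ]
        matrix.append(row)
    return processes, parameters, matrix


# Hypothetical content of one per-bin dictionary and the matching pre-fit yields
bin_dict = {
    "ttH": {"lumi": 10.5, "jes": 9.5},
    "ttbar": {"lumi": 205.0, "jes": 190.0},
}
prefit_yields = {"ttH": 10.0, "ttbar": 200.0}

processes, parameters, matrix = relative_changes(bin_dict, prefit_yields)
```

The rows and columns of `matrix` follow the order of `processes` and `parameters`, so they can be used directly as axis labels when filling a histogram or heat map.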