cms-analysis / CombineHarvester

CMSSW package for the creation, editing and analysis of combine datacards and workspaces
cms-analysis.github.io/CombineHarvester/

ValidateDatacards.py looks for TH1 not required by datacard. #285

Open rymuelle opened 1 year ago

rymuelle commented 1 year ago

When running ValidateDatacards.py datacard.txt, I get:

*******************************************************************************
Context: Function ch::GetClonedTH1 at 
  /afs/cern.ch/work/r/rymuelle/public/nanoAODzPrime/higgscombine/CMSSW_10_2_13/src/CombineHarvester/CombineTools/src/TFileIO.cc:24
Problem: TH1 SR1-sys_0_nominal-0 not found in 2016/2016_shapes_df_input.root
*******************************************************************************
Please report issues at
  https://github.com/cms-analysis/CombineHarvester/issues
*******************************************************************************

However, the datacard in question does not require "SR1-sys_0_nominal-0", and I am unsure what pattern ValidateDatacards.py is matching that would cause it to look for this TH1.

Datacard in question:

Combination of name0=2016/2016_SR1_BFFZprimeToMuMu_fit_M_125_dbs0p5.txt  name1=2016/2016_SR2_BFFZprimeToMuMu_fit_M_125_dbs0p5.txt
imax 2 number of bins
jmax 1 number of processes minus 1
kmax 13 number of nuisance parameters
----------------------------------------------------------------------------------------------------------------------------------
shapes *           name0       2016/2016_shapes_df_input.root SR1-sys_0_nominal-$PROCESS SR1-$SYSTEMATIC-$PROCESS
shapes background  name0       2016/2016_shapes_df_input.root SR1-sys_0_nominal-background
shapes *           name1       2016/2016_shapes_df_input.root SR2-sys_0_nominal-$PROCESS SR2-$SYSTEMATIC-$PROCESS
shapes background  name1       2016/2016_shapes_df_input.root SR2-sys_0_nominal-background
----------------------------------------------------------------------------------------------------------------------------------
bin          name0  name1
observation  -1     -1   
----------------------------------------------------------------------------------------------------------------------------------
bin                                          name0       name0       name1       name1     
process                                      125         background  125         background
process                                      0           1           0           1         
rate                                         -1          -1          -1          -1        
----------------------------------------------------------------------------------------------------------------------------------
lumi                    lnN                  1.025       -           1.025       -         
sys_0.5_ISRFSR_2016_    shapeN2              1.0         -           1.0         -         
sys_0.5_L1_2016_        shapeN2              1.0         -           1.0         -         
sys_0.5_Muon_2016_      shapeN2              1.0         -           1.0         -         
sys_0.5_btag_2016_      shapeN2              1.0         -           1.0         -         
sys_0.5_elSF_2016_      shapeN2              1.0         -           1.0         -         
sys_0.5_jer_2016_       shapeN2              1.0         -           1.0         -         
sys_0.5_jes_2016_       shapeN2              1.0         -           1.0         -         
sys_0.5_pdf_2016_       shapeN2              1.0         -           1.0         -         
sys_0.5_pu_             shapeN2              1.0         -           1.0         -         
sys_0.5_puid_2016_      shapeN2              1.0         -           1.0         -         
sys_0.5_roch_2016_      shapeN2              1.0         -           1.0         -         
sys_0.5_trigger_2016_   shapeN2              1.0         -           1.0         -

ajgilbert commented 1 year ago

This is a bug in the CH datacard parser. Because text2workspace accepts cards in which the two process lines are swapped, we have some logic in the CH parser to try and guess which is which, by seeing if the entry is convertible to int or not. Unfortunately it tries 125 here, which is convertible, so it decides this must be the process index and that 0 on the line below must be the process name. That is why $PROCESS gets expanded to 0 and the lookup is for SR1-sys_0_nominal-0. It would be better to assume the first line is the name when both are convertible to int. For now, I think you might be able to get it working by swapping these two lines in the card.
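To make the failure mode concrete, here is a minimal Python sketch of the guessing logic as described above. It is not the actual CombineHarvester code (the real parser is C++). The names `is_int`, `split_process_lines`, `expand` and the `prefer_first_as_name` flag are hypothetical, and checking only the first entry of the first process line is an assumption inferred from the description.

```python
def is_int(token):
    """Return True if the token parses as an integer."""
    try:
        int(token)
        return True
    except ValueError:
        return False


def split_process_lines(first, second, prefer_first_as_name=False):
    """Guess which of the two 'process' rows holds names and which holds indices.

    Returns (names, indices) as lists of strings.
    Assumption: only the first entry of each row is inspected.
    """
    first_is_int = is_int(first[0])
    second_is_int = is_int(second[0])
    if prefer_first_as_name and first_is_int and second_is_int:
        # Suggested tie-break: both rows look numeric, so trust the usual order.
        return first, second
    if first_is_int:
        # Behaviour described in the thread: a numeric first row is
        # taken to be the index row, so the other row becomes the names.
        return second, first
    return first, second


def expand(pattern, process):
    """Substitute the process name into a shapes pattern."""
    return pattern.replace("$PROCESS", process)


# The two process rows from the card in this issue.
row_a = ["125", "background", "125", "background"]
row_b = ["0", "1", "0", "1"]
pattern = "SR1-sys_0_nominal-$PROCESS"

# Described behaviour: "125" is numeric, so row_b is read as the names.
names, _ = split_process_lines(row_a, row_b)
print(expand(pattern, names[0]))

# Workaround: swap the rows in the card.
names, _ = split_process_lines(row_b, row_a)
print(expand(pattern, names[0]))

# Suggested fix: prefer the first row as names when both look numeric.
names, _ = split_process_lines(row_a, row_b, prefer_first_as_name=True)
print(expand(pattern, names[0]))
```

In this sketch the first call produces the histogram name from the error message, while the swapped card and the tie-break both produce SR1-sys_0_nominal-125. The tie-break only helps when the card is written in the conventional order (names first), since a card with two all-numeric rows is otherwise ambiguous.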