scikit-hep / pyhf

pure-Python HistFactory implementation with tensors and autodiff
https://pyhf.readthedocs.io/
Apache License 2.0

ATLAS-CONF-2018-041 is not validated #217

Open kratsg opened 6 years ago

kratsg commented 6 years ago

Description

This is the bug report to keep track of the validation attempt for ATLAS-CONF-2018-041. The goal is to have pyhf calculate the same expected and observed CLs values as those reported by HistFitter, which was used to perform the statistical fit in the CONF.

To do so, we are validating a single signal point that the exclusion fit was performed on: Gtt_2400_5000_800 (gluino mass = 2.4 TeV, neutralino mass = 800 GeV, and the stop mass is integrated out by setting it to 5 TeV). The workspace name for this is 3b_tag21.2.27-1_RW_ExpSyst_79800_multibin_excl_Gtt_2400_5000_800, and all the necessary XML/ROOT/JSON files are in the branch pyhf@validation/atlas-conf-2018-041.

From HistFitter, we have the following CLs from running run_cls.py (with no lumi uncertainty enabled):

{
    "CLs_exp": [
        0.07408094,
        0.1682635,
        0.3514716,
        0.6266292,
        0.8772564
    ],
    "CLs_obs": 0.1938868
}

From pyhf, we have the following:

{
    "CLs_exp": [
        0.10241684820963048,
        0.21210550275772358,
        0.4067405266774231,
        0.6746761281349481,
        0.8995358465168168
    ],
    "CLs_obs": 0.27743067969367186
}

Expected Behavior

We expect pyhf to match what HistFitter reports.

Actual Behavior

pyhf does not match the HistFitter CLs values.
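
To put a number on the mismatch, the relative difference between the two sets of values quoted above can be computed directly. This is a minimal sketch that uses only the numbers in this issue; the helper `relative_difference` is an illustrative name, not part of pyhf or HistFitter.

```python
# CLs values copied from the HistFitter and pyhf outputs quoted in this issue.
histfitter = {
    "CLs_exp": [0.07408094, 0.1682635, 0.3514716, 0.6266292, 0.8772564],
    "CLs_obs": 0.1938868,
}
pyhf_result = {
    "CLs_exp": [
        0.10241684820963048,
        0.21210550275772358,
        0.4067405266774231,
        0.6746761281349481,
        0.8995358465168168,
    ],
    "CLs_obs": 0.27743067969367186,
}


def relative_difference(test, reference):
    """Return (test - reference) / reference (illustrative helper)."""
    return (test - reference) / reference


# Observed CLs: pyhf relative to HistFitter.
obs_diff = relative_difference(pyhf_result["CLs_obs"], histfitter["CLs_obs"])

# Expected band: one relative difference per quoted value, in the quoted order.
exp_diffs = [
    relative_difference(p, h)
    for p, h in zip(pyhf_result["CLs_exp"], histfitter["CLs_exp"])
]

print(f"CLs_obs: {obs_diff:+.1%}")
for i, diff in enumerate(exp_diffs):
    print(f"CLs_exp[{i}]: {diff:+.1%}")
```

Every pyhf value is higher than its HistFitter counterpart, so the shift is one-sided rather than scattered around zero.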

Steps to Reproduce

This uses the branch, run_cls.py, and pyhf xml2json, in combination with hist2workspace, to generate all the necessary files and calculate the CLs.

kratsg commented 6 years ago

One current issue is that pyhf does not yet implement a lumi uncertainty.

kratsg commented 6 years ago

The procedure now is to remove systematics one by one until the two match. In the XML this means, for example, commenting out all HistoSys elements and then re-running pyhf xml2json | pyhf cls or something similar.

lukasheinrich commented 6 years ago

this might actually be a somewhat opportune moment to introduce some "pdf pruning" commands, e.g. tools to convert a pdf -> pdf' via transforms such as removing all modifiers of type x
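
A rough sketch of what such a pruning transform could look like on the JSON side. It assumes the spec is a dict laid out as channels -> samples -> modifiers, where each modifier carries a "type" key; `prune_modifiers` is a hypothetical helper, not an existing pyhf command, and the toy spec below uses made-up names and numbers.

```python
import copy


def prune_modifiers(spec, modifier_type):
    """Return a copy of `spec` with every modifier of `modifier_type` removed.

    Hypothetical helper: assumes spec["channels"][i]["samples"][j]["modifiers"]
    is a list of dicts that each have a "type" key.
    """
    pruned = copy.deepcopy(spec)  # leave the caller's spec untouched
    for channel in pruned["channels"]:
        for sample in channel["samples"]:
            sample["modifiers"] = [
                modifier
                for modifier in sample["modifiers"]
                if modifier["type"] != modifier_type
            ]
    return pruned


# Toy spec with made-up names and yields, only to exercise the helper.
toy_spec = {
    "channels": [
        {
            "name": "toy_channel",
            "samples": [
                {
                    "name": "toy_signal",
                    "data": [1.0, 2.0],
                    "modifiers": [
                        {"name": "toy_mu", "type": "normfactor", "data": None},
                        {
                            "name": "toy_syst",
                            "type": "histosys",
                            "data": {"hi_data": [1.1, 2.2], "lo_data": [0.9, 1.8]},
                        },
                    ],
                }
            ],
        }
    ]
}

without_histosys = prune_modifiers(toy_spec, "histosys")
remaining = [
    modifier["type"]
    for modifier in without_histosys["channels"][0]["samples"][0]["modifiers"]
]
print(remaining)
```

Returning a deep copy keeps the original spec available, so several pruned variants can be derived from the same starting point and compared.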

kratsg commented 6 years ago

First step: I'm removing all but 3 channels (Hnj-Hmeff, SR0L, SR1L, CR). Then I'm going to re-run to compare pyhf and HistFitter.
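
The same kind of reduction could in principle be done on the JSON instead of the XML. A minimal sketch, assuming a spec dict with a top-level "channels" list and, if present, an "observations" list whose entries are keyed by channel name; `keep_channels` is a hypothetical helper and the toy channel names are placeholders. If the JSON on this branch stores the observed data differently, that part would need adapting.

```python
import copy


def keep_channels(spec, names):
    """Return a copy of `spec` restricted to the channels listed in `names`.

    Hypothetical helper: assumes spec["channels"] is a list of dicts with a
    "name" key, and that observed data (if any) sits in spec["observations"]
    as a list of dicts that are also keyed by "name".
    """
    wanted = set(names)
    reduced = copy.deepcopy(spec)  # leave the caller's spec untouched
    reduced["channels"] = [
        channel for channel in reduced["channels"] if channel["name"] in wanted
    ]
    if "observations" in reduced:
        reduced["observations"] = [
            obs for obs in reduced["observations"] if obs["name"] in wanted
        ]
    return reduced


# Toy spec with placeholder channel names and made-up observed counts.
toy_spec = {
    "channels": [
        {"name": "region_a", "samples": []},
        {"name": "region_b", "samples": []},
        {"name": "region_c", "samples": []},
    ],
    "observations": [
        {"name": "region_a", "data": [3.0]},
        {"name": "region_b", "data": [5.0]},
        {"name": "region_c", "data": [7.0]},
    ],
}

reduced = keep_channels(toy_spec, ["region_a", "region_b"])
kept_names = [channel["name"] for channel in reduced["channels"]]
print(kept_names)
```

Filtering the observations together with the channels keeps the two lists consistent, so the reduced spec does not carry data for a channel that no longer exists.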

kratsg commented 6 years ago

Steps

time pyhf cls 3b_tag21.2.27-1_RW_ExpSyst_79800_multibin_excl_Gtt_2400_5000_800.json


Results

3 channels (Hnj-Hmeff, SR0L, SR1L, CR)

HistFitter

{
    "CLs_exp": [
        0.08699926516052142, 
        0.18880753135366438, 
        0.3780738391364131, 
        0.6503680223992481, 
        0.8885483255874077
    ], 
    "CLs_obs": 0.2669619554691735
}

pyhf:

{
    "CLs_exp": [
        0.09476105998889642, 
        0.20068731675458956, 
        0.39287657527882835, 
        0.663077243608068, 
        0.894364001610304
    ], 
    "CLs_obs": 0.29858205248651437
}

kratsg commented 6 years ago

Results

1 channel (Hnj-Hmeff, SR0L)

HistFitter

{
    "CLs_exp": [
        0.9896507294741269, 
        0.993334526794282, 
        0.9965052715158643, 
        0.9987375515074569, 
        0.9997569517725552
    ], 
    "CLs_obs": 0.9964967853017466
}

pyhf:

{
    "CLs_exp": [
        0.9933216606116845, 
        0.9957021423543831, 
        0.9977483734211297, 
        0.9991872479456055, 
        0.9998436499800815
    ], 
    "CLs_obs": 0.9977483521679603
}

kratsg commented 6 years ago

Results

2 channels (Hnj-Hmeff, SR0L, SR1L)

HistFitter

{
    "CLs_exp": [
        0.1391943326109654, 
        0.2635581673552643, 
        0.4653884764059395, 
        0.720773263353181, 
        0.9188614054149957
    ], 
    "CLs_obs": 0.3354334675660509
}

pyhf:

{
    "CLs_exp": [
        0.1447115475828421, 
        0.2708683111858936, 
        0.47327341680621826, 
        0.7266332139608659, 
        0.9211823957371932
    ], 
    "CLs_obs": 0.3713206584902179
}

lukasheinrich commented 6 years ago

can you post the JSONs?

kratsg commented 6 years ago

2-channel JSON:

https://gist.github.com/1008e972e0d0630d158198224721c652

1-channel JSON:

https://gist.github.com/kratsg/e4cde7d195656f769584ae73c51c0379