scikit-hep / pyhf

pure-Python HistFactory implementation with tensors and autodiff
https://pyhf.readthedocs.io/
Apache License 2.0

implement staterror #85

Closed lukasheinrich closed 6 years ago

lukasheinrich commented 6 years ago

should be mostly straightforward, but requires additional bookkeeping of which samples participate. Essentially one additional constraint term per bin.

lukasheinrich commented 6 years ago

this is the relevant section..

screenshot
kratsg commented 6 years ago

Need to essentially add a Poisson() term for each channel in the constraint PDF (around here: https://github.com/diana-hep/pyhf/blob/master/pyhf/__init__.py#L394). This would be useful to get the MBJ analysis up and running in HistFactory.

lukasheinrich commented 6 years ago

for basic mechanics of doing 2D contour check cell [48]

https://github.com/diana-hep/pyhf/blob/master/examples/notebooks/multiBinPois.ipynb

lukasheinrich commented 6 years ago

i.e. to add some more info

the general HiFa setup is:

each sample can declare whether or not it participates in stat. fluctuations. e.g. in the JSON we could have

name: 'some sample'
mods: [
  {'type': 'staterror', data: {'enable': true}},
]

this is a mod that will be shared across all enabled samples in the channel
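
A minimal sketch of the bookkeeping this implies, i.e. collecting the samples in a channel that opt in to the shared staterror mod. The mod layout follows the snippet above; the surrounding channel layout (a dict with a `samples` list) and the helper name `staterror_samples` are assumptions made for illustration, not the actual pyhf spec or API.

```python
def staterror_samples(channel):
    """Return the names of samples in `channel` that enable staterror.

    Assumes a hypothetical channel layout: {'samples': [{'name': ..., 'mods': [...]}]}.
    """
    enabled = []
    for sample in channel["samples"]:
        for mod in sample.get("mods", []):
            is_staterror = mod.get("type") == "staterror"
            # A sample participates only if the mod is present and enabled.
            if is_staterror and mod.get("data", {}).get("enable", False):
                enabled.append(sample["name"])
                break
    return enabled


# Hypothetical channel with two participating samples and one that opts out.
example_channel = {
    "samples": [
        {"name": "signal", "mods": []},
        {"name": "bkg1", "mods": [{"type": "staterror", "data": {"enable": True}}]},
        {"name": "bkg2", "mods": [{"type": "staterror", "data": {"enable": True}}]},
    ]
}
participating = staterror_samples(example_channel)
```

Here `participating` holds `['bkg1', 'bkg2']`; samples without the mod (or with `enable` set to false) are left out of the shared per-bin parameters.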

for each bin in the channel, you go through the samples and add up the mean rates of all samples that do participate. the mechanics of the auxiliary measurement are essentially the same as for the shapesys here

https://github.com/diana-hep/pyhf/blob/master/pyhf/__init__.py#L142

so for each channel one gets 1 Pois() term per bin (one can use broadcasting to compute this in one go using the tensorlib backends)
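
A rough sketch of what one Pois() term per bin could look like when computed in one go with NumPy broadcasting. It assumes the shapesys-like convention that the auxiliary measurement in each bin is `(nominal / uncertainty)**2` and that the constraint is `Pois(aux | gamma * aux)`; the function name and arguments are hypothetical and not taken from pyhf.

```python
import math

import numpy as np


def staterror_constraint_logpdf(gamma, nominal, uncertainty):
    """Sum over bins of log Pois(aux | gamma * aux).

    gamma:       per-bin staterror parameters (one per bin in the channel)
    nominal:     per-bin summed nominal rate of the participating samples
    uncertainty: per-bin absolute stat. uncertainty on that summed rate
    """
    gamma = np.atleast_1d(np.asarray(gamma, dtype=float))
    nominal = np.atleast_1d(np.asarray(nominal, dtype=float))
    uncertainty = np.atleast_1d(np.asarray(uncertainty, dtype=float))

    # Assumed auxiliary measurement per bin (same idea as shapesys).
    aux = (nominal / uncertainty) ** 2
    rate = gamma * aux

    # aux is generally not an integer, so use lgamma for log(aux!).
    log_factorial = np.array([math.lgamma(a + 1.0) for a in aux])

    # All bins are evaluated at once via elementwise (broadcast) operations.
    log_terms = aux * np.log(rate) - rate - log_factorial
    return float(np.sum(log_terms))
```

Under these assumptions the constraint is maximal at `gamma = 1` in every bin, so pulling any gamma away from 1 lowers the log-likelihood.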

I'll add more info later

tentative itemized work

[ ] extend spec so sample can declare participation in staterror
[ ] split up the bin-broadcasted sum in https://github.com/diana-hep/pyhf/blob/master/pyhf/__init__.py#L368 so that it moves from sum(samples) to sum(stack(disabled_stat_samples), axis=0) + gamma_stat_i * sum(stack(enabled_sum_samples), axis=0)
[ ] add constraint pdf for the gamma_stat_i e.g. Pois(gamma_stat_1)*Pois(gamma_stat_2)...
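
The second item can be sketched with NumPy roughly as follows. This is only an illustration of the split sum, assuming the per-sample rates are already stacked into a `(n_samples, n_bins)` array and using a boolean mask in place of two separate stacks; `expected_rates` and its arguments are hypothetical names, not existing pyhf code.

```python
import numpy as np


def expected_rates(sample_rates, participates, gamma_stat):
    """Per-bin expected rate with staterror applied to participating samples.

    sample_rates: array-like of shape (n_samples, n_bins)
    participates: one boolean per sample (True = staterror enabled)
    gamma_stat:   one gamma parameter per bin
    """
    rates = np.asarray(sample_rates, dtype=float)
    mask = np.asarray(participates, dtype=bool)
    gamma = np.asarray(gamma_stat, dtype=float)

    # Samples that do not participate are summed unchanged.
    disabled_sum = rates[~mask].sum(axis=0)
    # Participating samples are summed first, then scaled bin by bin.
    enabled_sum = rates[mask].sum(axis=0)

    return disabled_sum + gamma * enabled_sum
```

With all gammas set to 1 this reduces to the plain sum over samples, which is a handy sanity check when refactoring the existing sum.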