Open klemens90 opened 1 year ago
@klemens90 you haven't provided any model or a minimal reproducible example. Is there a reason for this? I'm not sure how we can help you if you don't provide more information.
Sorry for that. I will drop here the code I use, which is copied partially from the pyhf examples and then a bit arranged to my needs. It can be executed just with python3 brazilianPlot.py. It generates 1000 toys in 50 steps, prints the UL plot and the profile likelihood plots. In order to make it work with the pyhf code, the change in pyhf/src/pyhf/infer/init.py I mentioned above has to be made. I don't think this is the correct way, since I had to change the original code. When trying to access the signal+background and background only distributions afterward, it failed with "Inequality constraints incompatible", so I could not plot the profile likelihood ratios. If I write out the distributions on the fly, it works. Thanks for having a look! minimum_example.tar.gz
Hi @klemens90, the message about incompatible inequality constraints indicates fit issues. I had a brief look at the model you defined and noticed that your signal yield is very large (100000.0
) compared to your observed counts (1.0
) and also the amount of background you have defined (0.3
). I would generally recommend free-floating parameters to be of O(1), while in this case the signal normalization will be of O(1e-6). This is not necessarily an issue, but if you do see fit issues it may be worth re-defining the normalization such that you get away from a regime where floating point issues might appear.
Summary
Dear authors,
we are using the pyhf library to compute upper limits with Toys. For validation, we are checking the distributions of the signal+background and background only hypothesis. At the POI of the computed UL we check, whether the CLs calculated by the fraction of their integrals, is around 0.1. This is the case, but we observe a strange bimodal distribution, which we did not expect to see. An example you can find in the below plot.
To generate these plot, we return
sig_plus_bkg_distribution
andbkg_only_distribution
which are defined in the pyhf source code herehttps://github.com/scikit-hep/pyhf/blob/a4ec176d07a7484c2e4829ed69f08ad3bc3e6af6/src/pyhf/infer/__init__.py#L170
. This code is changed in a way, that it returns the two distributions mentioned above below L202 in the same file. At the end of calculation, we use the
samples
function of objects to retrieve the distributions shown below. The value of the test statistics (green line in the plot) is retrieved from the calculator objecthttps://github.com/scikit-hep/pyhf/blob/a4ec176d07a7484c2e4829ed69f08ad3bc3e6af6/src/pyhf/infer/__init__.py#L159
with
calc.teststatistic()
. It defines the point from which we integrate the distribution until infinity, to get the areas in blue and orange.Do you have an idea, why these distributions have a bimodal shape? Is the approach how to retrieve the signal+background and background samples ok, or is there another way how to get these w/o editing the pyhf code itself?
OS / Environment
Steps to Reproduce
File Upload (optional)
No response
Expected Results
I would expect a more smooth distribution, like shown in the plots below, from another analysis. CLs performed with ROOT.
Actual Results
Instead, the distribution as shown in the problem description is observed.
pyhf Version
Code of Conduct