takanesano opened this issue 1 year ago
par_bounds[model.config.poi_index] = (0,500000)
you might want to rescale your signal and background so your POI bounds aren't as large as they are. These sorts of fits tend to break down when the parameters span several orders of magnitude.
Thank you for the suggestion. Can you explain in more detail how to rescale the sig/bkg to avoid this? Does it simply mean multiplying all of sig, bkg, and data by some small number?
Can you explain in more detail how to rescale the sig/bkg to avoid this?
@takanesano I haven't had time to look at this Issue in full yet, but to address this last question now: while I think(?) we have a GitHub Discussion on this somewhere, here's at least an old example from Stack Overflow (from before we transitioned questions to GitHub Discussions): Fit convergence failure in pyhf for small signal model (which was also Issue #762). The version of pyhf used in that example is old, so the APIs differ from v0.7.x, but I think it demonstrates what you need.
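For concreteness, here is roughly what that rescaling can look like with the v0.7.x API. This is only a sketch: the yields and the rescale factor below are placeholders for illustration, not numbers from this issue. The idea is to multiply only the signal template by a factor so the POI limit lands in an O(1)-O(100) range, and then multiply the fitted limit by the same factor to translate it back to the original signal normalization, since mu * (rescale * s) == (mu * rescale) * s.
import numpy as np
import pyhf

# placeholder yields for illustration only (not the numbers from this issue)
bkg = [1.1e7, 5.1e6]
bkg_uncertainty = [2.2e6, 1.0e6]
sig = [0.9, 4.1]  # tiny signal relative to background -> huge limit on mu

rescale = 1e3  # multiply the signal so the limit on the POI is O(1)-O(100)
sig_rescaled = list(np.asarray(sig) * rescale)

model = pyhf.simplemodels.uncorrelated_background(
    signal=sig_rescaled, bkg=bkg, bkg_uncertainty=bkg_uncertainty
)
data = pyhf.tensorlib.astensor(bkg + model.config.auxdata)

par_bounds = model.config.suggested_bounds()
par_bounds[model.config.poi_index] = (0, 100)  # sane bounds after rescaling

obs_limit, exp_limits = pyhf.infer.intervals.upper_limits.upper_limit(
    data, model, par_bounds=par_bounds
)
# mu * (rescale * s) == (mu * rescale) * s, so the limit in the original
# signal normalization is rescale * obs_limit
print("limit on the original signal strength:", rescale * float(obs_limit))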
@matthewfeickert Thank you so much for the information!
Rescaling the signal with a factor of 1e-2 worked for the case of sig1.
My next problem is that when trying to find a scale factor that makes the fit work, the window of factors that do work is very narrow. The code below is an example.
import numpy as np
import pyhf
from pyhf.contrib.viz import brazil
bkg = [11153054.0, 5122485.5, 1612950.8, 623655.0, 288350.78, 133780.98, 68429.08, 29384.553, 16960.21, 7732.061]
bkg_uncertainty = [2230610.8, 1024497.1, 322590.16, 124731.0, 57670.156, 26756.197, 13685.815, 5876.9106, 3392.0422, 1546.4122]
sig3 = [887.22363, 4129.3247, 9386.207, 6953.6787, 2835.5503, 1102.8623, 437.77795, 196.58675, 81.46747, 30.449577]
pyhf.set_backend("numpy")
rescale = 5.
sig = list(np.array(sig3) * rescale)
model = pyhf.simplemodels.uncorrelated_background(
    signal=sig, bkg=bkg, bkg_uncertainty=bkg_uncertainty
)
par_bounds = model.config.suggested_bounds()
par_bounds[model.config.poi_index] = (0,100)
init_pars = model.config.suggested_init()
init_pars[model.config.poi_index] = 0.
data = pyhf.tensorlib.astensor(bkg + model.config.auxdata)
obs_limit, exp_limit_and_bands = pyhf.infer.intervals.upper_limits.upper_limit(
    data, model, par_bounds=par_bounds, init_pars=init_pars
)
This returns
ValueError: Invalid function value: f(10.000000) -> nan
As Fit convergence failure in pyhf for small signal model suggests, I've checked CLs for µ=1 with the scale factor I'm using here.
Observed CLs for µ=1: 0.5915001216
-----
Expected (-2 σ) CLs for µ=1: 0.246
Expected (-1 σ) CLs for µ=1: 0.392
Expected CLs for µ=1: 0.592
Expected (1 σ) CLs for µ=1: 0.806
Expected (2 σ) CLs for µ=1: 0.950
I think this is a reasonable value, but I couldn't find any working scale factors while scanning rescale in range(1., 10., 1.).
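The kind of scan I mean looks roughly like the sketch below. It reuses sig3, bkg, and bkg_uncertainty from the snippet above, and the caught exceptions are the two failure modes mentioned in this issue; the rest is illustrative.
import numpy as np
import pyhf
from pyhf.exceptions import FailedMinimization

# assumes sig3, bkg, bkg_uncertainty are defined as in the snippet above
working = []
for rescale in np.arange(1.0, 10.0, 1.0):
    sig = list(np.array(sig3) * rescale)
    model = pyhf.simplemodels.uncorrelated_background(
        signal=sig, bkg=bkg, bkg_uncertainty=bkg_uncertainty
    )
    data = pyhf.tensorlib.astensor(bkg + model.config.auxdata)
    par_bounds = model.config.suggested_bounds()
    par_bounds[model.config.poi_index] = (0, 100)
    init_pars = model.config.suggested_init()
    init_pars[model.config.poi_index] = 0.0
    try:
        obs_limit, _ = pyhf.infer.intervals.upper_limits.upper_limit(
            data, model, par_bounds=par_bounds, init_pars=init_pars
        )
    except (ValueError, FailedMinimization):
        continue  # this scale factor did not converge
    # translate the limit back to the original signal normalization
    working.append((rescale, rescale * float(obs_limit)))

print(working)  # scale factors for which the limit computation succeeded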
This is an interesting example where minimization can fail for what looks like a rather simple model. I simplified this a bit further, down to two bins:
import numpy as np
import pyhf
bkg = [5000000.0, 1612950.8]
bkg_uncertainty = [1000000.1, 322590.16]
sig = [4000.0, 9386.207]
model = pyhf.simplemodels.uncorrelated_background(
signal=sig, bkg=bkg, bkg_uncertainty=bkg_uncertainty
)
par_bounds = model.config.suggested_bounds()
par_bounds[model.config.poi_index] = (0, 100)
data = bkg + model.config.auxdata
# hypotest
res = pyhf.infer.hypotest(11.843640283180768, data, model, par_bounds=par_bounds)
print(np.isnan(res)) # NaN with scipy, Minuit gives non-NaN
res = pyhf.infer.hypotest(11.843640, data, model, par_bounds=par_bounds)
print(np.isnan(res)) # not NaN with scipy
# maximum likelihood estimates
pyhf.set_backend("numpy", "minuit")
# default Minuit is far off from the minimum (POI=0 is the MLE here)
print(pyhf.infer.mle.fit(data, model, par_bounds=par_bounds))
# tuning helps, but cannot lower tolerance further (cannot reach EDM threshold)
print(pyhf.infer.mle.fit(data, model, par_bounds=par_bounds, tolerance=1e-5, strategy=2))
output:
[...]/pyhf/src/pyhf/infer/calculators.py:467: RuntimeWarning: invalid value encountered in divide
CLs = tensorlib.astensor(CLsb / CLb)
True
False
[0.62423955 0.99950048 0.99636748]
[0.03494499 0.99997209 0.99979678]
Some observations:
- hypotest failing with 11.843640283180768 and succeeding with 11.843640 when using the scipy optimizer is very surprising. The Minuit backend does not return NaN for either of these two settings.
- The maximum likelihood fit does not find the minimum (which is at mu=0) even with Minuit, and even with tuning the tolerance and strategy a bit. Lower tolerances result in failed fits (the desired EDM threshold cannot be reached).
- Different initial parameter settings might help here; I imagine the difficulty might be related to the mu=1 initial setting already providing a very good fit to begin with. Another issue is that the best-fit value for mu is at a boundary; relaxing the boundary also helps a bit but does not seem sufficient either (see the sketch after this list).
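A minimal sketch of those two tweaks on the two-bin example (reusing model and data from above; the starting value and the relaxed lower bound are illustrative, not tuned):
import pyhf

# assumes model and data are defined as in the two-bin example above
pyhf.set_backend("numpy", "minuit")

init_pars = model.config.suggested_init()
init_pars[model.config.poi_index] = 0.1  # start the POI away from the mu=1 default

par_bounds = model.config.suggested_bounds()
par_bounds[model.config.poi_index] = (-10, 100)  # relax the lower POI boundary

print(
    pyhf.infer.mle.fit(
        data, model, init_pars=init_pars, par_bounds=par_bounds,
        tolerance=1e-5, strategy=2,
    )
)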
From some further experimentation, I think the main problem is that the POI bounds are too small given the sensitivity and that the default tolerance is too high. The attached example sets the bounds to [-500, 500] and the Hessian estimate is still very bad. This is using Minuit, and even with strategy=2 I cannot push the tolerance low enough to reliably get convergence, depending on the initial parameter settings (the scan is easier since the POI is fixed there).
Summary
When I try to get upper_limit, toms748 returns the error FailedMinimization: Inequality constraints incompatible or ValueError: Invalid function value: f(9.500000) -> nan.
Expected Results
Expected the upper limit computation to go through and return obs_limit.