MarcelHoh opened 1 year ago
Hi @MarcelHoh, that sounds like a useful addition. We could add a `minimizer` (or similar) keyword argument to `limit`, `ranking`, `scan`, `significance` and then allow switching to `scipy` for the cases where parameter uncertainties are not strictly needed. There are currently two internal ways of doing the minimization (depending on the `custom_fit` setting): either going through `pyhf` or directly using `iminuit`. This setting has historically been useful but probably has limited use now, so it might be a good time to drop it from `ranking` and `scan`, thereby avoiding the case where a `scipy` minimizer wrapper would also have to be added to `cabinetry`.
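As a rough sketch of what such a keyword argument could look like (all names here are hypothetical, and `scipy` stands in for both backends so the example is self-contained; a real implementation in `cabinetry` would switch the optimizer backend via `pyhf` or call `iminuit` directly):

```python
import numpy as np
from scipy import optimize


def fit(objective, init_pars, minimizer="minuit"):
    """Hypothetical dispatch on a `minimizer` keyword argument:
    'minuit' for fits that need parameter uncertainties,
    'scipy' as a faster path when only the minimum matters."""
    if minimizer == "scipy":
        # fast path: no uncertainties required, just the best-fit point
        result = optimize.minimize(objective, init_pars, method="L-BFGS-B")
    elif minimizer == "minuit":
        # slow-but-robust path; in cabinetry this would go through
        # iminuit (directly or via pyhf), sketched here with scipy
        result = optimize.minimize(objective, init_pars)
    else:
        raise ValueError(f"unknown minimizer: {minimizer}")
    return result.x, result.fun


# toy objective: quadratic "NLL" with minimum at (1.0, 0.5)
nll = lambda p: (p[0] - 1.0) ** 2 + (p[1] - 0.5) ** 2
bestfit, fval = fit(nll, np.array([0.0, 0.0]), minimizer="scipy")
```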
I can give it a go. To add some data: with scipy a test fit takes me 40 ms, compared to 12 s with Minuit. For a ranking with ~190 parameters, as I'm using for a test, this ends up being 6 minutes (tested) with scipy versus an estimated 3+ hours with Minuit.
> We could add a `minimizer` (or similar) keyword argument to `limit`, `ranking`, `scan`, `significance` and then allow switching to `scipy` for the cases where parameter uncertainties are not strictly needed.
I am not familiar with the code (this is my first real look at it) but I would naively lean towards adding a `force_minuit` option to `_fit_model` and only setting the backend to Minuit when this is true. This would then let users choose between scipy and Minuit if they wanted to. On the other hand, I cannot currently think of a reason to not use scipy when the uncertainties are not needed, so it might make sense to force this.
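A minimal sketch of that `force_minuit` idea (the flag and function name are hypothetical, and scipy stands in for both branches so the snippet runs on its own; the Minuit branch would really switch the `pyhf` backend or use `iminuit`):

```python
from scipy import optimize


def _fit_model_sketch(objective, init_pars, force_minuit=False):
    """Hypothetical force_minuit flag: only switch to the Minuit
    backend when explicitly requested; otherwise stay on the
    (faster) scipy path."""
    if force_minuit:
        # placeholder for the iminuit/pyhf-Minuit code path
        method = "BFGS"
    else:
        # fast scipy path for fits where uncertainties are not needed
        method = "L-BFGS-B"
    return optimize.minimize(objective, init_pars, method=method)


# toy objective with minimum at (2.0, -1.0)
nll = lambda p: (p[0] - 2.0) ** 2 + (p[1] + 1.0) ** 2
fast = _fit_model_sketch(nll, [0.0, 0.0])
robust = _fit_model_sketch(nll, [0.0, 0.0], force_minuit=True)
```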
That's an impressively big difference in runtime between minimizers! I don't think I have seen one that is quite this striking before.
I would prefer to stick with Minuit as the default minimizer regardless. It is often slower and can be more fragile, but I have seen a number of cases where the scipy minimizer appears to have converged but returns nonsensical results. I have a lot more confidence in the minimum being accurate when determined via Minuit and think the safe but slow choice is therefore the better default. This is a place where `cabinetry` is somewhat opinionated, switching the default away from what `pyhf` uses and making `iminuit` a core dependency.
In my experience, using scipy as the minimizer is significantly faster than Minuit. This can be very useful when the uncertainty of each parameter is not needed, such as for limit setting and parameter ranking.
Within `cabinetry`, the `cabinetry.fit._fit_model` function always overrides the minimizer set in `pyhf` to be Minuit. This makes the ranking and limits quite a bit slower than they need to be. For the ranking, only the first fit, with no fixed nuisance parameters, should need to use Minuit as the optimizer. Would it be possible to make the default switch to Minuit an option in `cabinetry.fit._fit_model`? This would allow always using Minuit if the user desires, but also switching to scipy for all cases where the uncertainty is not needed.
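To illustrate why only the first ranking fit needs uncertainties, here is a toy sketch (function names are hypothetical and scipy stands in for both backends; the initial fit is the one that would stay on Minuit, while the fixed-parameter refits only need the minimum and could use the fast scipy path):

```python
import numpy as np
from scipy import optimize


def ranking_sketch(nll, init_pars):
    # initial fit: parameter uncertainties are needed here, so this is
    # the one fit that would use Minuit; approximated via BFGS's
    # inverse-Hessian estimate for this self-contained example
    nominal = optimize.minimize(nll, np.asarray(init_pars, dtype=float))
    uncertainties = np.sqrt(np.diag(nominal.hess_inv))

    impacts = []
    for i in range(len(nominal.x)):
        free = [j for j in range(len(nominal.x)) if j != i]
        for shift in (+1.0, -1.0):
            fixed_val = nominal.x[i] + shift * uncertainties[i]

            def nll_fixed(p_free):
                # parameter i is held fixed at its +/-1 sigma value
                p = nominal.x.copy()
                p[free] = p_free
                p[i] = fixed_val
                return nll(p)

            # fixed-parameter refit: only the minimum matters, so the
            # fast scipy path is sufficient for all of these fits
            refit = optimize.minimize(
                nll_fixed, nominal.x[free], method="L-BFGS-B"
            )
            impacts.append((i, shift, refit.fun - nominal.fun))
    return nominal, impacts


# toy two-parameter "likelihood" with minimum at (1.0, 2.0)
toy_nll = lambda p: 0.5 * ((p[0] - 1.0) ** 2 / 0.25 + (p[1] - 2.0) ** 2)
nominal, impacts = ranking_sketch(toy_nll, [0.0, 0.0])
```

With ~190 parameters this loop runs hundreds of fixed-parameter fits, which is where a 40 ms vs 12 s per-fit difference adds up.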