scikit-hep / iminuit

Jupyter-friendly Python interface for C++ MINUIT2
https://scikit-hep.org/iminuit
Other
285 stars 76 forks source link

Add beginner-friendly tutorial for use of cost functions #622

Closed jvavrek closed 3 years ago

jvavrek commented 3 years ago

I recall in C++ ROOT there was a TH1::Fit() flag where one could easily switch between a chi2 and a Poisson likelihood for fitting a histogram. Does something like that exist in iminuit? I scoured the documentation, especially for iminuit.cost, but the keyword "Poisson" is not present.

iminuit.cost.BinnedNLL looks relevant, but then do I need to provide the Poisson cdf manually? If so, a PoissonLoss class would be very useful to build in.

Or maybe I could just change the loss kwarg in iminuit.cost.LeastSquares?

HDembinski commented 3 years ago

To fit a Poisson likelihood you can use one of the BinnedNLL or ExtendedBinnedNLL cost functions, both are likelihoods for histograms with Poisson-distributed counts. You don't need to provide the Poisson cdf. The cdf you need to provide is the model of your data distribution over the histogram. So let's say you want to fit a Gaussian peak over some range in x, then you pass the Gaussian cdf.

HDembinski commented 3 years ago

This tutorial is not very beginner friendly: https://github.com/scikit-hep/iminuit/blob/develop/tutorial/cost_functions.ipynb I should add one that shows how to use the builtin cost functions for a typical fit.

jvavrek commented 3 years ago

Thanks. A beginner-friendly tutorial that fits a peak and background and compares the chi2 vs Poisson likelihood results would be a very helpful addition.

In my own case, I made a custom cost function that computes the Poisson loss, and passed that to Minuit, e.g.:

def poisson_loss(y_eval, y_data):
    return np.sum(y_eval - scipy.special.xlogy(y_data, y_eval))

def model_loss(*args):
    """evaluate poisson loss of the given model"""
    ...

m = Minuit(model_loss, variables, **guesses)
m.errordef = Minuit.LIKELIHOOD
m.migrad()
m.hesse()
m.minos()

I compared it against some Poisson-loss lmfit code and got almost identical results, save for a few cases I will pin down to poor convergence due to bad initialization.

HDembinski commented 3 years ago

This looks reasonable. The use of scipy.special.xlogy is a good idea.