zfit / zfit-development

The developement repository for zfit with roadmaps, internal docs etc to clean up the issues
0 stars 2 forks source link

Add Hesse covariance matrix to FitResult #41

Closed marinang closed 5 years ago

marinang commented 5 years ago

Currently if one minimises loss function and call hesse on the fit result,

nll = ... 
minimizer = MinuitMinimizer()
result = minimizer.minimize(loss=nll)
result.hesse()

a way to get the covariance matrix is

result._minimizer._minuit_minimizer.covariance() # Covariance matrix (dict (name1, name2) -> covariance).
#or 
result._minimizer._minuit_minimizer.np_covariance() # Covariance matrix in numpy array format.

there are also methods to get either the covariance/correlation matrix called matrix or np_matrix, see iminuit API https://iminuit.readthedocs.io/en/latest/api.html .

So this should be accessible directly through zfit FitResult, either like iminuit with a covariance and matrix method, or covariance and correlations methods ? This should takes a param argument to get a matrix for any parameters. Also about the output format maybe this can be specified in the same way as you can specify an output format in uproot through outputtype argument in iterate for instance. A default output type could be a dictionary.

So something like this

def covariance(self, params: ParamsTypeOpt = None, method: Union[str, Callable] = 'minuit_hesse',
              outputtype: ... ) -> ...:
jonas-eschle commented 5 years ago

Yes, this sounds reasonable. Do we need a "correlation" method? It's the standardized covariance, and "correlation" here would imply the "linear correlation", right? Because there are quite a few different ways to define correlation. But covariance is well defined.

So yes, a covariance function seems good. I agree, the outputtype should here be changeable (or let's say we need both probably). What do you think isn't as a matrix the 90% usecase anyway? And as a dict the rare but still used one? So we may just add a flag as_dict=False?

Method probably to "None" and specifying in the docs what's the default.

marinang commented 5 years ago

Coming back at this. What should be the order the of parameter, aka. the columns and rows names of the matrix, should it be returned? Or should we see in the docs that if not precised, its's FitResult.params ?

Ok for the as_dict flag, do we agree that otherwise a Numpy array is returned for the matrix?

For the method, actually there is no need to call Hesse to get the covariance, it's just not exact probably. It gets updated after a Hesse call.

marinang commented 5 years ago

I don't tests for the FitResult class, did I miss them?