Possible bug when calculating SM covariance

MJKirk commented 6 years ago

In the _get_covariance_sm function (flavio/statistics/fits.py#L534), comparing https://github.com/flav-io/flavio/blob/2303b05b3584c0f6a9882ef15730316f8a1b6b49/flavio/statistics/fits.py#L544 (the code for multiple threads) to https://github.com/flav-io/flavio/blob/2303b05b3584c0f6a9882ef15730316f8a1b6b49/flavio/statistics/fits.py#L539 (code for a single thread) the get_predictions_array function is called with different arguments, since the default is True for all three parameters. In particular, it looks like it just uses the SM wilson coefficients if doing a single threaded calculation, but our Wilson coefficient function if using multiple threads.

DavidMStraub commented 6 years ago

Hi,

at first sight it actually seems OK, although admittedly very badly written and poorly commented.

For 1 thread, self._get_random(par=False, nuisance=True, wc=False) generates parameter sets with random nuisance parameters but central parameters and WCs. In get_predictions_array, setting the arguments seems redundant as the parameter sets already have central parameters and WCs.

For N threads, self._get_random_nuisance is called which does exactly the same, and setting the arguments on get_predictions_array is not necessary.

Or did you actually find a numerical difference between the two?

MJKirk commented 6 years ago

Some background - the reason I noticed something was wrong is that I was getting RuntimeWarnings appearing when I run make_measurement, which came from my wilson coefficient function being called with both of the arguments set to zero - I've got something like an NP contribution going as coupling / mass, and I was trying to do a fit in that plane.

Assuming this is something I should be able to do (?) - the issue is actually arising when the get_predictions function gets the SM prediction https://github.com/flav-io/flavio/blob/2303b05b3584c0f6a9882ef15730316f8a1b6b49/flavio/statistics/fits.py#L340 as my WC function gets passed down and so gets called with the arguments set to zero.

So I think we do still need to be calling get_predictions_array with wc=False...

DavidMStraub commented 6 years ago

OK, now I see the problem. It's that the wc_function does not return vanishing Wilson coefficients for vanishing arguments. This was probably assumed in some cases, since the wc_function was originally meant to map between different bases and conventions for WCs rather than actual model parameters and WCs. But I agree that in principle this should not be assumed. So it is indeed a bug, but will need some careful examination of other parts of the code as well.

MJKirk commented 6 years ago

Well, good to know this is a bug. Would I be right in thinking this could be worked around (at least for the moment) by using some "well-behaved" WC function to calculate the SM covariance and then save it, and then load that data when using my actual WC function?

DavidMStraub commented 6 years ago

Yes. "Well behaved" would in that case mean that it returns vanishing WCs for vanishing arguments.

peterstangl commented 6 years ago

Due to the changes introduced in PR #47, the fit_wc_function is not called with its arguments set to zero anymore when calculating the SM covariance on multiple threads. Consequently, the above described bug should be fixed by this PR.

However, before closing the issue, it might be useful to add a test with a fit_wc_function that raises an error when called to make sure that the SM covariance calculation does not depend on this function at all.

Furthermore, it might be necessary to check if there are other cases where fit_wc_function is called with its arguments set to zero.

DavidMStraub commented 6 years ago

However, before closing the issue, it might be useful to add a test with a fit_wc_function that raises an error when called to make sure that the SM covariance calculation does not depend on this function at all.

Good idea. I just pushed such a test to your PR branch.

Furthermore, it might be necessary to check if there are other cases where fit_wc_function is called with its arguments set to zero.

Actually this is currently unavoidable by design, e.g. for Bayesian fits, where the starting values for the arguments of the function are assumed to be close to 0. I have to think about it.

But as far as this issue is concerned, I think it can be closed when #47 is merged.

DavidMStraub commented 6 years ago

Sorry, had to revert that commit; the fit_wc_function is always called once on instantiation to make sure it is properly defined.

peterstangl commented 6 years ago

As a simple work around, is it possible to define the fit with a working fit_wc_function such that there is no error on instantiation and then to replace that function after instantiation by manipulating the FastFit instance?

DavidMStraub commented 6 years ago

Yes that would be possible I guess ...

DavidMStraub commented 6 years ago

Solved by #47.

flav-io / flavio

Possible bug when calculating SM covariance #42