Minuit.hesse() doesn't compute parameter errors (and there is no warning)

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. Run the attached script bb_minuit.py (requires bb_utils.py, which is also 
attached)
2. You can also run bb_scipy.py if you like (also requires bb_utils.py), which 
gives correct results.

The correct errors that hesse() should produce are
bb_utils.BoxBOD.perr = np.array([1.2354515176e+01, 1.0455993237e-01])

The output I get is attached as bb_minuit_output.txt

What version of the product are you using? On what operating system?
pyminuit 1.2.0 linked against SEAL-Minuit 1.7.9 with python 2.7 on Mac OS X 
10.7 Lion

As I already suggested in ticket #30, it would be nice if some unit tests for 
hesse() and minos()
were included in pyminuit.
At the moment I don't know if I am using bad parameters for hesse() or if my 
version of pyminuit is broken.

Original issue reported on code.google.com by Deil.Chr...@googlemail.com on 29 Aug 2011 at 11:18

Attachments:

GoogleCodeExporter commented 9 years ago

This is a question about the behavior of Minuit, not the PyMinuit wrapper, but 
I'll say as much as I know about it.

The MIGRAD subroutine usually finds the minimum and the error matrix in one go; 
the HESSE subroutine is only provided so that you can be certain.  The normal 
operation is for functions to just use the MIGRAD subroutine, but I encountered 
a case in my thesis where MIGRAD converged on a minimum but not an error 
matrix--- hence, I always call HESSE afterward and recommend it.

As for the uncertainties it returns, Minuit gives

>>> m.errors
{'b1': 0.7732838788874276, 'b2': 0.00689659943153219}

by MIGRAD/HESSE and

>>> m.merrors
{('b1', -1.0): -0.7709548489058111, ('b1', 1.0): 0.7757043918606901, ('b2', 
1.0): 0.006947923990092301, ('b2', -1.0): -0.006845982275713673}

by MINOS.  These are different algorithms (MIGRAD/HESSE is a numerical second 
derivative, MINOS crawls up the chi2 function, looking for an intercept), which 
gives some confidence that they're right.

I additionally checked it by hand/eye, by plotting the chi2 function near its 
minimum plus-and-minus one sigma as calculated by Minuit:

>>> from cassius import *     # http://code.google.com/p/cassius/
>>> view(Curve(lambda x: chi2(x, 0.547), xmin=213.8-0.773, xmax=213.8+0.773))
>>> view(Curve(lambda x: chi2(213.8, x), xmin=0.547-0.0069, xmax=0.547+0.0069))

and indeed the chi2 function increases by about one unit over this distance 
scale in b1 and b2.  The much larger values, 12.35451518 and 0.10455993 (from 
NIST?) can't be right for this function and dataset.

> As I already suggested in ticket #30, it would be nice if some unit tests for 
hesse() and minos()
were included in pyminuit.

That would be nice, though PyMinuit is a small project and is rarely updated.  
(It is a generalization of something I found useful in my thesis five years 
ago, a wrapper around a package that hasn't substantially changed since the 
1970's.  Also, it's a version of Minuit that hasn't been updated since 2006.)

If you're interested in writing a set of unit tests, I can give you developer 
permissions.  (I just asked Rob Franke if he'd like to commit his Python3 port, 
and that's something that could be good to test.)

Best,
-- Jim

Original comment by jpivar...@gmail.com on 30 Aug 2011 at 2:47

GoogleCodeExporter commented 9 years ago

Jim, 

thanks for your quick reply and comments.
I am debugging this at the moment and realized that I do get the correct errors 
from NIST if I set:
m.up = 2 ** 8

I.e. I am simply missing a factor in the definition of my chi2 function, 
although I haven't figured out yet what is missing yet.
Please let me know if you have an idea (I have 6 data points and 2 parameters, 
do I have to divide by the number of degrees of freedom somewhere?)

--------------

If you give me permissions I'd be happy to add the hesse(), minos() etc. 
examples from the GettingStarted wiki page as unit tests.
Given that MINUIT is so old and production-tested I think it's basically enough 
to hit every method in the pyminuit wrapper once just so that the user can 
easily be sure his installation is working.

One thing I don't know how to do is add a minuit.test() method directly, since 
there is no python wrapper, just a C wrapper, right?

Christoph

Original comment by Deil.Chr...@googlemail.com on 30 Aug 2011 at 3:04

GoogleCodeExporter commented 9 years ago

> I.e. I am simply missing a factor in the definition of my chi2 function, 
although I haven't figured out yet what is missing yet.
Please let me know if you have an idea (I have 6 data points and 2 parameters, 
do I have to divide by the number of degrees of freedom somewhere?)

A normal chi^2 definition goes like this:

chi^2 = sum_i (x_i - f(x_i))**2 / err_x_i**2

and your function doesn't have explicit err_x_i's (so they're implicitly 1.0).  
The number of degrees of freedom (6-2 = 4 in this case) is only needed for 
calculating the chi^2 probability (1 - incomplete_gamma(ndf/2, chi2/2)).

> One thing I don't know how to do is add a minuit.test() method directly, 
since there is no python wrapper, just a C wrapper, right?

Yes, this package only has a C(++) module, no pure Python anywhere.  The unit 
tests could be a pure Python submodule in the same package.  If it's added to 
setup.py, a user could call them like this:

>>> import minuit   # to get the package itself
>>> import minuittests
>>> minuittests.test()

-- JIm

Original comment by jpivar...@gmail.com on 30 Aug 2011 at 3:33

GoogleCodeExporter commented 9 years ago

I'm behind the times on my own project: try checking out revision r93 from SVN. 
 It has a test/ directory containing test_minuit.py.  Is this what you were 
looking for?

Original comment by jpivar...@gmail.com on 30 Aug 2011 at 3:54

GoogleCodeExporter commented 9 years ago

> I'm behind the times on my own project: try checking out revision r93 from 
SVN.  It has a test/ directory containing test_minuit.py.  Is this what you 
were looking for?

r93 is now pyminuit-1.2.1, so you can check it out as a package.  It has a 
suite of tests, though they are not in setup.py for users to run.  (This was 
all done by Rob Franke.)  If you're still interested in making changes to the 
tests (e.g. additional tests that Rob didn't implement), I can make you a 
developer so that you can do that.

Thanks,
-- JIm

Original comment by jpivar...@gmail.com on 30 Aug 2011 at 4:28

GoogleCodeExporter commented 9 years ago

I have a few tests I'd like to add to test/test_minuit.py, 
which basically test the examples and FunctionReference shown on the pyminuit 
web page.
All tests pass for me on Mac OS X 10.7 with python 2.5 to 3.2 (I don't have 
python2.4). :-)

It would be nice if there was a python wrapper around minuit.so so that it 
becomes possible to have tests and version info installed with minuit.so:
>>> import minuit
>>> minuit.test() # Simple executes what is now test/test_minuit.py
>>> minuit.__version__

I think this just requires an __init__.py with "from lib_minuit import *" and a 
few changes to setup.py,
but I don't have any experience so I don't want to mess with installation.

I was confused for a minute because svn trunk has version="1.1.2" in setup.py 
(should be changed to "HEAD"?) and the installed egg-info. I already was at r93.

Original comment by Deil.Chr...@googlemail.com on 31 Aug 2011 at 12:08

GoogleCodeExporter commented 9 years ago

I've looked into this issue some more and found out that apparently the 
covariance matrix
returned by HESSE has to be multiplied by a factor s_sq = chi**2 / (degrees of 
freedom) to get correct errors.

Correct in the sense that they match the NIST StRD reference and what other 
fitting packages give.

So there is nothing wrong with pyminuit, this is a question about the 
interpretation of the results
returned by MINUIT.

Please have a look at the discussion I started here in the hope that someone 
could explain to me why this factor s_sq needs to be applied:
http://mail.scipy.org/pipermail/scipy-user/2011-August/030420.html

My current understanding is that the error definition in the MINUIT user guide 
and the errors given in the fitting example in the pyminuit GettingStarted 
guide is incorrect, because they do not contain this factor:
>>> m.errors
{'a': 0.12247448677828, 'b': 0.045790546904858835}

The correct result should be
perr: [ 0.33828315  0.12647671]

which you can get by running the attached script chi2_example.py.

Or more likely I am misunderstanding something fundamental.

Thoughts?

Original comment by Deil.Chr...@googlemail.com on 31 Aug 2011 at 3:24

Attachments:

chi2_example.py

GoogleCodeExporter commented 9 years ago

It looks to me like an "S-factor", a factor which is multiplied to an 
uncertainty to account for internal disagreement in the data.  The way that 
I've seen it in physics (combining experimental results into a world average) 
is like this:

If the chi^2/ndf of a collection of experiments measuring the same thing is 
less than 1, report only the combined uncertainty;
if greater than one, report the combined uncertainty times chi^2/ndf.

It quantifies disagreement among data points as an additional uncertainty, and 
different people have different interpretations about whether or not it's a 
good thing to do.  From the standpoint of a program that produces statistical 
output, however, PyMinuit should do the bare minimum and allow the user to 
multiply the output by whatever he or she things is appropriate.

Original comment by jpivar...@gmail.com on 31 Aug 2011 at 3:48

GoogleCodeExporter commented 9 years ago

Jim, thanks for your help figuring this one out.
You can close this ticket as invalid, there is no problem with Minuit or 
pyminuit.

For future reference, the short summary of this discussion is this:

Other fitting packages (e.g. scipy.optimize.curve_fit) will give you different 
parameter errors from what you get with a simple chi^2 fit as e.g. implemented 
in the Fitting example of the GettingStartedGuide on the pyminuit wiki pages. 
They will automatically multiply the covariance matrix by s_sq = chi^2/ndf, 
i.e. the errors are multiplied by s = sqrt(s_sq).

As far as I know most published physics or astronomy results don't apply this 
factor, whereas apparently it seems to be applied most of the time e.g. in 
statistics and finance. Also the NIST Statistical Reference Dataset fit 
solutions at http://www.itl.nist.gov/div898/strd/general/dataarchive.html have 
this factor applied, i.e. you can reproduce them with MIGRAD and HESSE by 
computing the covariance matrix with a chi^2 fit with y_error = 1 on the data 
and then multiplying the covariance matrix by s_sq after the fit.

Basically that means that for every published result or fitting package you use 
you have to look out for this s factor.

Original comment by Deil.Chr...@googlemail.com on 1 Sep 2011 at 9:16

GoogleCodeExporter commented 9 years ago

Here is the additional unit tests I had mentioned before. You can add them to 
test_minuit.py

class TestExample(unittest.TestCase):
    """Run the example from the pyminuit page as unit tests:
    http://code.google.com/p/pyminuit/"""

    def setUp(self):
        self.m = minuit.Minuit(f, x=10, y=10)

    def test(self):
        self.migrad()
        self.hesse()
        self.minos()
        self.contour()
        self.scan()

    def migrad(self):
        self.m.migrad()
        self.assertAlmostEqual(self.m.fval, 0, places=5)
        self.assertAlmostEqual(self.m.ncalls, 72)
        self.assertAlmostEqual(self.m.edm, 0, places=5)
        self.assertAlmostEqual(self.m.values['x'], 2, places=2)
        self.assertAlmostEqual(self.m.values['y'], 0, places=2)

    def hesse(self):
        self.m.hesse()
        self.assertAlmostEqual(self.m.ncalls, 19)
        self.assertAlmostEqual(self.m.errors['x'], 3, places=5)
        self.assertAlmostEqual(self.m.errors['y'], 1, places=5)
        c = self.m.covariance
        self.assertAlmostEqual(c[('x','x')], 9, places=3)
        self.assertAlmostEqual(c[('x','y')], 0)
        self.assertAlmostEqual(c[('y','x')], 0)
        self.assertAlmostEqual(c[('y','y')], 1, places=3)
        c = self.m.matrix(correlation=True)
        self.assertAlmostEqual(c[0][0], 1)
        self.assertAlmostEqual(c[0][1], 0)
        self.assertAlmostEqual(c[1][0], 0)
        self.assertAlmostEqual(c[1][1], 1)

    def minos(self):
        self.m.minos()
        self.assertAlmostEqual(self.m.ncalls, 18)
        e = self.m.merrors
        self.assertAlmostEqual(e[('x', -1.0)], -3, places=2)
        self.assertAlmostEqual(e[('x',  1.0)],  3, places=2)
        self.assertAlmostEqual(e[('y', -1.0)], -0.7858, places=2)
        self.assertAlmostEqual(e[('y',  1.0)],  0.7858, places=2)

    def contour(self):
        c = self.m.contour("x", "y", 1.)
        self.assertAlmostEqual(c[0][0], -1, places=2)
        self.assertAlmostEqual(c[0][1], 0, places=2)
        self.assertAlmostEqual(c[1][0], -0.7700, places=2)
        self.assertAlmostEqual(c[1][1], -0.3592, places=2)
        self.assertAlmostEqual(c[2][0], -0.3815, places=2)
        self.assertAlmostEqual(c[2][1], -0.5354, places=2)

    def scan(self):
        s = self.m.scan(("x", 5, 0, 10), ("y", 5, 0, 10), corners=True)

class TestFunctionReference(unittest.TestCase):
    """Test the things that are not covered by TestExample,
    but are mentioned in the FunctionReference 
    to make sure they really exist and work:
    http://code.google.com/p/pyminuit/wiki/FunctionReference"""

    def setUp(self):
        self.m = minuit.Minuit(f, x=10, y=10)

    def test(self):
        minuit.machine_precision()
        # Minuit methods
        self.m.simplex()
        # Minuit members
        self.m.fcn
        p = self.m.parameters
        self.assertEqual(p, ('x', 'y'))
        v = self.m.values
        self.assertEqual(self.m.args, (v['x'], v['y']))
        # Minuit user settings
        f = self.m.fixed
        f['y'] = True
        self.assertEqual(f['x'], False)
        self.assertEqual(f['y'], True)
        l = self.m.limits
        l['y'] = (42, 43)
        self.assertEqual(l['x'], None) 
        self.assertEqual(l['y'][0], 42) 
        self.assertEqual(l['y'][1], 43) 
        self.m.maxcalls = 42
        self.assertEqual(self.m.maxcalls, 42)
        self.m.tol = 4.2
        self.assertEqual(self.m.tol, 4.2)
        self.m.strategy = 2
        self.assertEqual(self.m.strategy, 2)
        self.m.up = 4
        self.assertEqual(self.m.up, 4)
        self.m.printMode = 3
        self.assertEqual(self.m.printMode, 3)

Original comment by Deil.Chr...@googlemail.com on 1 Sep 2011 at 9:18

jpivarski / pyminuit

Minuit.hesse() doesn't compute parameter errors (and there is no warning) #31