MannyMoo opened 5 years ago
I've made pull plots for 3e-4 precision, but they don't look much better than those for 1e-3. I've attached all four plots for reference:

- 1e-3, original chi2
- 1e-3, Poisson errors (new chi2)
- 3e-4, original chi2
- 3e-4, Poisson errors (new chi2)
(Edited to show plot names)
Yeah, basically unchanged between 1e-3 and 3e-4. Hard to say if using Poisson errors actually improves things either. OK, try the different arguments to `minimize` and see if it makes any difference.
Another way to potentially validate the fitter independently of the generator would be to simply generate the histograms directly:

Actually, a slightly easier way is:
Possibly `scipy.optimize.minimize` just isn't accurate enough for our needs. So we can try implementing something in MINT. You'll need to write a class that inherits from `MINT::Minimisable` (source and header) and implements the `getVal` member function to return the chi-squared. In its constructor, you'll need to pass it the data histograms and the input variables needed to calculate the chi-squared (<t>, <t^2>, Rb, Xb). The fit parameters (Re(z), Im(z), Re(deltaz), Im(deltaz)) will be instances of `MINT::FitParameter` (doc), added to the `MinuitParameterSet` of your `Minimisable`-derived class. These will then be used in `getVal`, along with the input parameters, to calculate and return the chi-squared.
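As a rough sketch (the class name, member names and input arguments here are illustrative, not a definitive implementation), such a class could look like:

```cpp
#include <complex>
#include <vector>
#include "Mint/FitParameter.h"
#include "Mint/Minimisable.h"

// Illustrative skeleton of a Minimisable-derived chi-squared class.
class BinFlipChi2 : public MINT::Minimisable {
public:
  BinFlipChi2(double meanT, double meanT2,
              const std::vector<double>& Rb,
              const std::vector<std::complex<double> >& Xb)
    : m_meanT(meanT), m_meanT2(meanT2), m_Rb(Rb), m_Xb(Xb),
      // Each FitParameter registers itself with the parameter set from getParSet().
      m_re_z("Re_z", MINT::FitParameter::FIT, 0., 1e-3, 0, 0, getParSet()),
      m_im_z("Im_z", MINT::FitParameter::FIT, 0., 1e-3, 0, 0, getParSet()),
      m_re_dz("Re_deltaz", MINT::FitParameter::FIT, 0., 1e-3, 0, 0, getParSet()),
      m_im_dz("Im_deltaz", MINT::FitParameter::FIT, 0., 1e-3, 0, 0, getParSet()) {}

  // Minuit minimises whatever getVal returns, so return the chi-squared
  // calculated from the current parameter values and the stored inputs.
  virtual double getVal() {
    double chi2 = 0.;
    // ... compute the predicted ratios from m_re_z, m_im_z, m_re_dz, m_im_dz,
    //     m_meanT, m_meanT2 and m_Xb, and compare them to m_Rb ...
    return chi2;
  }

private:
  double m_meanT, m_meanT2;                 // <t> and <t^2>
  std::vector<double> m_Rb;                 // measured ratios per bin
  std::vector<std::complex<double> > m_Xb;  // X values per phase-space bin
  MINT::FitParameter m_re_z, m_im_z, m_re_dz, m_im_dz;
};
```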
I've tried the only other scipy method that returns the inverse Hessian matrix, L-BFGS-B (the default method is BFGS). Again, the means look good but the standard deviations are way off - the documentation does say the Hessian is an approximation, however, which is likely the cause of this.
While I'm working on implementing something in Mint today, I'll try running it once more with a lower tolerance and see if that changes anything.
OK, that seems to have gone too far in the other direction, so the uncertainties are now overestimated.
Once you have your `Minimisable`-derived class that implements `getVal`, put the `.cpp` in `src/Utils/` and the `.h` in `Mint/` with the other public headers (make sure you prefix the file name with `Mint/` in any `#include`). Then check that it builds OK.
To actually do the fit you'll need to pass an instance of your class to an instance of `MINT::Minimiser` (doc), which runs Minuit to do the minimisation. Add the class name to `Mint2/dict/Mint2Dict.xml`, along with `MINT::Minimiser`, and `#include` the headers for it and `Minimiser` in `Mint2/dict/Mint2Dict.h`. Then build again. This generates the Python bindings for the classes, so in Python you can do:
```python
import Mint2
from ROOT import Minimiser, <your class>

binflipfitter = <your class>( ... )
minimiser = Minimiser(binflipfitter)
minimiser.doFit()
```
Then it looks like you can save the fit results to an ntuple by doing something like:
```python
from ROOT import TFile

parset = minimiser.parSet()
resultsfile = TFile("results.root", "recreate")
resultstree = parset.makeNewNtpForOwner(resultsfile)
parset.fillNtp(resultsfile, resultstree)
resultstree.Write()
resultsfile.Close()
```
Then you can easily make a `TChain` of the results files for each dataset and plot the pulls. Or you can use your current code for pull plotting, whatever you prefer.
Actually, I realised `Minimiser` is set up to calculate the errors for a log-likelihood fit, which assumes that a change of 1 in the likelihood corresponds to a change of 1 sigma in a fit parameter. But for a chi-squared fit, it's a change of 0.5 in the chi-squared that determines the fit parameter uncertainty. So I've added an option to `Minimiser` to set what change in the function to use to determine the uncertainties. You can pull the changes from master. Then instead initialise the `Minimiser` as:

```python
minimiser = Minimiser(binflipfitter, 0.5)
```
I've added my header file at `Mint2/Mint/binflipChi2.h` and my source at `Mint2/src/Utils/binflipChi2.cpp`, then added `MINT::Minimiser` and `MINT::binflipChi2` to the XML, and added their headers to the header file as well. Everything compiles OK; however, in Python I can't seem to import my class. I'm able to import both `Minimiser` and `Minimisable` without any issues.
What I have written in Python is:

```python
import Mint2
from ROOT.MINT import Minimiser, binflipChi2

binflipfitter = binflipChi2()
```

And the output I get is:

```
ImportError: cannot import name binflipChi2
```
I've pushed the `.h` and `.cpp` files I have so far. They should just be a blank template that I can add my member variables to and write the chi2 in once I have this step working; hence the constructor doesn't take any arguments just now. Is there something else I'm forgetting to do?
Scratch that, sorry - the issue was that `Minimiser` needs to be imported from `ROOT.MINT`, while I need to import `binflipChi2` from just `ROOT`, so I have it working now.
OK, good. One other catch is that you'll need to pass the lists of Rb & Xb, etc., to the `binflipChi2` instance. Currently you have these in Python `list`s, which aren't trivial to use on the C++ side. But you can convert them to `std::vector`s using ROOT, like:

```python
mylist = [ ... ]
myvector = ROOT.vector('double')()
for val in mylist:
    myvector.push_back(val)
```

You can also make vectors of `complex<double>` like:

```python
myvector = ROOT.vector('complex<double>')()
```

(Note the trailing `()`: `ROOT.vector('complex<double>')` only gives you the class; you need to call it to construct an instance.) Then you can write your `binflipChi2` constructor to accept `std::vector`s, and pass these from the Python script.
The way I've implemented it so far (should be pushed on my branch) is to use the `array` module within Python, which allows me to pass it as a C-style array (via a pointer). Would it be better to change that to vectors as you've described, or is that OK as it is?
That would work too, I guess. Using `vector`s is a bit cleaner since they can tell you their size (you don't have to pass it along with the pointer), and they support iterators. But it's up to you. If you have something working then no need to change it now.
Looking at what you have just now, I think what's missing is to add the `MINT::FitParameter`s for Re(z), Im(z), Re(deltaz) and Im(deltaz). You can add these as member variables to the `binflipChi2` class. Then pass to the constructor the initial values and step sizes for each - the step size should be a rough guess of the uncertainty. Then in the `binflipChi2` constructor, initialise the `FitParameter`s using this constructor, like:

```cpp
m_re_z("Re_z", MINT::FitParameter::FIT, re_z_initvalue, re_z_stepsize, 0, 0, getParSet()),
```
I've added the `FitParameter`s and pushed those changes earlier this morning. Currently my code is crashing at the point where I try to push a new `TGraph` onto my vector of `TGraph`s (line 77):

```cpp
fits[i].push_back( TGraph(m_nbinsTime) );
```

I'm trying different things, such as creating a temporary `TGraph` and then pushing that back rather than calling the constructor in the `push_back`, or trying a different structure, but so far I haven't got anything working. If there's anything obvious you can see that I'm missing, let me know!

Edit: Fixed now. I was creating the nested vector without giving a size, and so was trying to `push_back` on something that didn't actually exist.
Ah yeah, actually you just need

```cpp
vector<vector<TGraph> > fits(2);
```

rather than

```cpp
vector<vector<TGraph> > fits(m_nbinsPhase);
```
Yeah, no worries, I've changed that. I'm getting an odd error where the fit parameter list grows as I process each file, adding the new fit parameters to the existing list instead of replacing them. I've not yet been able to work out why, and I need to head off this afternoon, but I'll look into it on Monday morning. If you have any ideas off the top of your head as to why that could be happening, let me know.
That's because the `FitParameter`s belong to a `MINT::MinuitParameterSet`, which you pass in the constructor with `getParSet()`. But if the `Minimisable` instance doesn't have its own parameter set, it uses a default, shared one (see here). So each new instance of `binflipChi2` adds parameters to the same parameter set. The `Minimisable` base class can be passed its own `MinuitParameterSet` in its constructor. So in the `binflipChi2` constructor, you should call the `Minimisable` constructor with a pointer to a new `MinuitParameterSet`. Make sure to delete the `MinuitParameterSet` in the destructor, as it won't be deleted by default. Then each instance of `binflipChi2` has its own parameter set.
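A minimal sketch of that ownership pattern (constructor arguments elided; only the parameter-set handling is shown):

```cpp
#include "Mint/Minimisable.h"
#include "Mint/MinuitParameterSet.h"

class binflipChi2 : public MINT::Minimisable {
public:
  binflipChi2(/* data, input variables, initial values & step sizes */)
    // Give this instance its very own parameter set rather than the shared default.
    : MINT::Minimisable(new MINT::MinuitParameterSet())
      // , m_re_z("Re_z", MINT::FitParameter::FIT, ..., getParSet()), etc.
  {}

  // This instance allocated the set, so it must delete it:
  // Minimisable won't do so by default.
  virtual ~binflipChi2() { delete getParSet(); }
};
```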
I've been debugging most of today, and have my code at a stage where I'm able to fit the faked data. I've checked that the values produced by calculating the numerator and denominator of the ratio (used to fake the bin counts) give the same ratios as my fit function, which is a reasonable indication that my fit values should be correct.

The trouble I'm still having, however, is how to calculate the errors. I've attached some pull plots where I'm either taking the same percentage error as the generated data, or just using 5% of the theory value as the error. These still aren't quite right, although it is worth noting that MINT is still telling me that the fits have failed in many cases, which could be the cause. I've tried playing around with the error values in the case where I can't extract them from the generated data, using e.g. sqrt(theory value) or 1% of the theory value, but these give extremely small errors on the fit values and so throw the pull plots way off (despite the actual fitted values being very good). Larger values such as 10% seem to have the opposite effect and smear the data too much for an accurate fit.
Pull plots: imDz, reDz, imZcp, reZcp.
Since the fit values match the values which I'm smearing to get the faked data, I think the pull plots must be off because of one of three things:

1. The way I'm using `Minimiser` to perform the fit isn't working.
2. There's an error in my chi2 calculation.
3. There's an error somewhere in the structure of my code.

I'll keep investigating the latter two tomorrow, but let me know if you have any other thoughts on what I could try. My latest version of the code should also be pushed on my branch.
OK, it could just be down to failed fits (though I'm not sure why they should be failing). I've added a function to the `Minimiser` class to get the fit status, so if you pull the changes from master and rebuild you can do:

```python
minimiser = Minimiser(...)
minimiser.doFit()
if minimiser.isConverged():
    ...  # success
else:
    ...  # failed
```
Looking over `binflipChi2`, I've not found any errors. The only potential issue is using

```cpp
err = sqrt(pval);
```

when the bin is empty. If `pval` isn't a valid count, it would give an unphysical error. So maybe it's better just to set that bin to zero in the faked histos as well.

A sanity check would be to create the faked histos with zero error in all bins (no smearing with a Gaussian), and then make the ratios and compare to the ratio graphs used in the fit. They should match up exactly if things are working correctly.
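For concreteness, a hedged sketch of generating a faked histogram directly from predicted bin counts (the helper name and binning are illustrative; the sqrt(count) Gaussian smearing via `TRandom3::Gaus` is as described in this thread):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>
#include "TH1D.h"
#include "TRandom3.h"

// Fill a histogram with theory-predicted counts, each smeared by a Gaussian
// of width sqrt(count). Pass smear = false for the zero-error sanity check.
TH1D makeFakeHist(const std::vector<double>& predicted, bool smear = true,
                  int seed = 1) {
  TRandom3 rng(seed);  // fixed seed so the faked data is reproducible
  TH1D hist("fake", "faked counts", static_cast<int>(predicted.size()), 0., 1.);
  for (std::size_t i = 0; i < predicted.size(); ++i) {
    const double err = std::sqrt(predicted[i]);
    const double val = smear ? predicted[i] + rng.Gaus(0., err) : predicted[i];
    hist.SetBinContent(i + 1, val);
    hist.SetBinError(i + 1, smear ? err : 0.);
  }
  return hist;
}
```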
It would be possible to make the `pval` and `mval` valid counts by normalising the calculation of `F` and `Fbar` in `computeIntegrals`, i.e., scale all the `F` values by 1/(sum of `F` values), and similarly for `Fbar`. Then they're normalised to 1 event. So before passing them to `binflipChi2`, multiply them all by the number of D or Dbar events. Then `pval` and `mval` should be valid counts and you could just use their square roots as errors. You don't need to scale the `X` values, since they're insensitive to the normalisation of the amplitudes by definition.
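A minimal sketch of that scaling (hypothetical helper name; call it once with the number of D0 events for the `F` values and once with the number of D0bar events for the `Fbar` values):

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Scale the values so they sum to nEvents: normalising to one event and
// multiplying up by the event count in a single step.
void normaliseToEvents(std::vector<double>& values, double nEvents) {
  const double total = std::accumulate(values.begin(), values.end(), 0.);
  for (std::size_t i = 0; i < values.size(); ++i)
    values[i] *= nEvents / total;  // each value is now a valid expected count
}
```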
One other thing: in the scan in `computeIntegrals`, I think you want

```
s13 = s13min + (i + 0.5) * ds13
```

and

```
s23 = s23min + (j + 0.5) * ds23
```

so that you use the centre of each grid square rather than its bottom-left corner. I doubt it'll make a significant difference though.
> Looking over `binflipChi2`, I've not found any errors. The only potential issue is using `err = sqrt(pval);` when the bin is empty. If `pval` isn't a valid count it would give an unphysical error. So maybe it's better just to set that bin to zero in the faked histos as well.
>
> A sanity check would be to create the faked histos with zero error in all bins (no smearing with a Gaussian), and then make the ratios and compare to the ratio graphs used in the fit. They should match up exactly if things are working correctly.
I've already compared the ratios from the faked histograms to the ratio values used in the fit - that's what I meant yesterday by it being a reasonable indication my fit was OK (sorry if that comment was a little wordy!) - so that should be fine.

I had considered that, although I was a little unsure how to then handle the errors for the empty bins - or should I just exclude them from the chi2 calculation?
> It would be possible to make the `pval` and `mval` valid counts by normalising the calculation of `F` and `Fbar` in `computeIntegrals`, i.e., scale all the `F` values by 1/(sum of `F` values), and similarly for `Fbar`. Then they're normalised to 1 event. So before passing them to `binflipChi2`, multiply them all by the number of D or Dbar events. Then `pval` and `mval` should be valid counts and you could just use their square roots as errors. You don't need to scale the `X` values since they're insensitive to the normalisation of the amplitudes by definition.
I've now added the normalisation, as well as using the centres of the grid squares, and pushed the changes.

When you say multiply by the number of D0/D0bar events, do you mean separately in each individual bin, or just by the total number in all bins?
Just the total number, but separately for D0 (for F) & D0bar (for Fbar).
Actually, maybe you should skip the bin if either of the +ve or -ve entries is zero, rather than just if the denominator is zero.
I've plotted the chi2 as a function of each parameter while the others are fixed. In each case, the plots seem fine:
Plots: ReZcp, ImZcp, ReDz, ImDz.
The code I used to make the plots was:

```python
canvasList = []
g_chi2 = []
npoints = 100
for j in range(4):
    if j == 0:
        parMin = -0.0035
        parMax = -0.006
    elif j == 1:
        parMin = -0.003
        parMax = -0.005
    elif j == 2:
        parMin = -0.0005
        parMax = -0.003
    elif j == 3:
        parMin = 0.004
        parMax = 0.006
    dz = (parMax - parMin) / npoints
    par = parset.getParPtr(j)
    g_chi2.append(ROOT.TGraph(npoints))
    for i in range(npoints):
        val = parMin + i * dz
        par.setCurrentFitVal(val)
        chi2 = binflipfitter.getVal()
        g_chi2[j].SetPoint(i, val, chi2)
    par.resetToInit()
    canvasList.append(ROOT.TCanvas("{}".format(j), "{}".format(j)))
    canvasList[j].cd()
    g_chi2[j].SetMarkerStyle(5)
    g_chi2[j].SetMarkerColor(ROOT.kBlack)
    g_chi2[j].SetLineWidth(3)
    g_chi2[j].Draw("AP")
```
My `binflipChi2` and `Minimiser` instances were initialised before this, and `parset` was extracted from the `Minimiser`. I also hadn't performed any fitting at this point.
Yep, they look pretty much perfect ... I really don't understand why the fits are failing then. You could try doing 2D scans in each pair of parameters, to see if there are any extreme correlations. I think I've worked out why calling `fixAndHide` wasn't working - the `Minimiser` does some setup in its constructor, so you'd need to fix the parameters via the `binflipChi2` instance before making the `Minimiser` instance. You can call `getParSet` on your `binflipChi2` instance to get its `MinuitParameterSet`.

Can you attach the text output of one (or a few) of the failed fits so I can see if there are any more clues in there?
I've attached the output for a failed fit, and also for a successful one (no smearing) for reference. In each case I've assigned error = sqrt(val), where val is the numerator/denominator that I calculate, and added a random variable using `TRandom3::Gaus` with width equal to the error and mean 0, which leads to the failed fit, as before. In both cases, I completely ignore empty bins in the 'real' data, both when generating faked data and when calculating the chi2.

Successful fit: outSuccess.txt
Failed fit: outFailure.txt
I'd actually tried 2D scans this morning for Zcp and deltaZ (but not yet for combinations of parts of each). These looked OK, apart from the chi2 being much more sensitive to the imaginary parts than the real parts; you can also see this on the plots I attached on Friday. This is also mentioned in the analysis note (pg 6), so it shouldn't be the issue. I can't attach the root files here, but can email them if you want me to. I've attached a couple of plots in the meantime.

Plots: Zcp, deltaZ.
For completeness, I've attached 2D plots for all other combinations. Everything seems OK as far as I can tell:
Plots: imaginary parts only; real parts only; Re(dZ) vs Im(Zcp); Re(Zcp) vs Im(dZ).
OK, when you don't do any smearing, the fit converges to a chi squared of effectively zero, which is what you'd expect. See the line:
```
FCN=3.84023e-08 FROM MIGRAD STATUS=CONVERGED 59 CALLS 60 TOTAL
```
And the covariance matrix is sensible - the correlations between the parameters are small, which matches what the scans above show. Correlations are shown by a rotation and stretching of the 2D plots.
When the fit fails, the chi squared is around 600. For a well behaved chi squared distribution, the (chi squared)/(NDOF) ~ 1. Can you count how many non-zero terms contribute to the chi squared? This gives you the number of degrees of freedom (NDOF).
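A hedged sketch of counting them (assuming the per-bin counts are available as vectors and using the same skip condition as the fit; note that, strictly, each floated parameter also reduces the NDOF):

```cpp
#include <cstddef>
#include <vector>

// Count the chi-squared terms that actually contribute: bins skipped for
// zero counts add no degrees of freedom, and each floated fit parameter
// removes one.
int countNDOF(const std::vector<double>& pvals,
              const std::vector<double>& mvals,
              int nFitParams = 4) {
  int nTerms = 0;
  for (std::size_t i = 0; i < pvals.size(); ++i)
    if (pvals[i] > 0. && mvals[i] > 0.) ++nTerms;
  return nTerms - nFitParams;
}
```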
The reason it's failed is that the correlations between the parameters are close to 100%, so it can't find a minimum of the chi-squared.
When you did the scans, was this with or without the smearing?
For the dataset on which I ran the failed fit, there are 598 terms contributing - the rest are ignored because the bin counts in the data are zero. If I neglect terms which don't contribute much (using a tolerance of 0.01 just now), then ~550 terms are still retained.
The scans were produced with smearing - let me know if you want to see them without and I can make those plots.
> Correlations are shown by a rotation and stretching of the 2D plots.
Would we expect to see stretching but not rotations for larger correlations? If so, the way I've scaled the plots may affect their interpretation. I've not always scaled the plots above equally in x and y, because the difference in sensitivity between parameters makes it difficult to see the variation in one direction; e.g. on the deltaZ plot, the range of y values plotted around Im(dZ) (in units of Im(dZ)) is around 1/3 of the range around Re(dZ) on the x axis (in units of Re(dZ)).
No, it'd always be a combination of rotation and stretching. Actually, I just realised I had the error definition wrong - it's a delta chi-squared of 1 that gives the uncertainties (for likelihood fits it's 0.5). So rather than

```python
minimiser = Minimiser(binflipfitter, 0.5)
```

it should be

```python
minimiser = Minimiser(binflipfitter, 1.)
```

I don't think that should affect the stability of the fits, but try changing it and see.

Also, can you commit and push the version of the fitting script that's using `binflipChi2`?
No problem, I've changed that, but as you say it doesn't seem to make a difference to the result. To avoid getting rid of the previous code which was using `scipy`, I've added a new script (`binflipAnalysis.py`), which I've now pushed.

I've also pushed my current version of `binflipChi2` to make sure everything is up to date; however, I don't think there are any significant differences there.
OK, after some debugging, I think one issue is rounding errors arising from mixing `float` and `double`. I've pushed some changes to master of both Mint2 and AGammaD0Tohhpi0 so that it uses `double` for the R-values and the chi-squared. This seemed to stabilise things when I fit only one parameter, but I still have issues when fitting all four parameters. So I think you'll need to change it so that all the input variables (X, R, <t>, <t^2>, F) are also doubles, and `zcp` & `deltaz` are `complex<double>` in `getFits`.

(I also made `m_fakeData` an `int` which is used as the seed for the random number generator, so the fake data is reproducible.)
Pull plots after changing all variables to `double` (using faked data):
Pull plots on the properly generated data (without experimental effects):
OK, I've pushed the new time-dependent generator, which I think is working (I still want to do some more validation, but you can try it out). If you pull the changes from master on Mint2 and rebuild, you'll pick it up. You run it in exactly the same way as before, with `gen-pipipi0.py`, but you don't have to generate the integrators first - just do the data generation step.
New pull plots using the new generator:
Mostly OK, but still a clear bias for Re(Zcp). Interestingly, looking at the plot of just the fit values gives a mean value which is consistent with -1 * Re(Zcp) [actual value: -4.5e-03] - not sure whether this is a coincidence or could point towards what's going on. I'll start looking through my code to see if I can find any possible bugs.
OK, yeah, that's interesting - it could point to a sign error somewhere. Again, it could be in either the generator or the fitter. I'll check through the generator again.

Not found anything yet. Re(z) corresponds to y, so to test whether it is a sign error, you could generate with, e.g., 3x larger y and see if you get the same effect.
No problem, I'll try that and get back to you later with some results. I likewise haven't found much, apart from the fact that making the pull plot using -Re(Zcp) gives exactly what we'd like - though maybe that's just a coincidence and confirmation bias.

I've just made the pull plots for y = 0.0195 as you suggested. They're broadly similar, but still with a large bias on Re(Zcp), although the plot of fit values for Re(Zcp) does rule out a simple sign error.
OK, so we're still getting a ~+0.9 sigma bias on Re(zcp) regardless of its value. It may be an intrinsic bias due to binning the data. You could try changing the number of decay time bins, e.g. using 25 and 100 instead of 50, and see if this affects the bias.
I've got the fits running again with the binning set differently, since I picked up this morning that the analysis note uses 10 decay time bins - I'll let you know if that appears to change the results. They also use bins of varying size, so if the binning does seem to change the results, I can also look into implementing that.
Plots using 10 decay time bins instead of 50:
Interesting, so not a sign flip and not the decay time binning ... Try a larger number of decay time bins anyway. I guess you could also try increasing the number of phase space bins, say doubling it.
Just using a simple configuration of the resonance model, verify that we get unbiased pulls (mean consistent with zero, width consistent with one) for Re(z), Im(z), Re(deltaz) and Im(deltaz).

Currently we're using `minimize` from `scipy.optimize` in the default configuration. With an integrator precision of 1e-3 and the correct calculation of Poisson errors in the ratio histograms, the pull means look fairly unbiased, but the widths are > 1. Things to try are:

- The different arguments to `minimize`, see here. First try explicitly setting `method = SLSQP`, which should work for least-squares fitting. It's not clear from the doc what the other methods do, but try googling some to see if any seem appropriate and try them out. For each, just run the fits on the same set of generated datasets and examine the pull plots to find which works best.
- The `ftol` argument, the default of which is 1e-6. So try `ftol = 1e-7` and see if the pull distributions improve.
- Possibly `minimize` isn't powerful enough for our purposes, so we could try fitting using MINT, which will require writing a C++ class that calculates the chi-squared.

Try these in sequence, and if we find one that gives unbiased pulls, we can stop.