LSSTDESC / TXPipe

Pipeline elements for 3x2pt analysis (shear-shear, shear-density, density-density) for DC2
BSD 3-Clause "New" or "Revised" License

Jackknife capability? #54

Closed chihway closed 4 years ago

chihway commented 4 years ago

At the Texas sprint week we talked about how it might be desirable to have the capability of getting jackknife covariances. These could be used either to compare with the analytical covariance later or to provide error bars for the null tests (which one would need for calculating a chi2).
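For reference, a minimal numpy sketch of how a delete-one jackknife covariance could feed a chi2 null test. The per-patch array here is made-up toy data, not TXPipe's actual data structure:

```python
import numpy as np

# Toy stand-in: xi_del[i] is the statistic recomputed with patch i deleted.
rng = np.random.default_rng(0)
npatch, nbins = 50, 8
xi_del = rng.normal(size=(npatch, nbins))

xi_mean = xi_del.mean(axis=0)
d = xi_del - xi_mean
# Jackknife covariance: (N-1)/N * sum_i d_i d_i^T
cov = (npatch - 1) / npatch * (d.T @ d)
# Chi^2 of the measurement against zero, as one would use for a null test
chi2 = xi_mean @ np.linalg.solve(cov, xi_mean)
```

In a real pipeline `xi_del` would come from re-running the estimator with each patch removed; everything downstream of that is just the standard jackknife formula.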

@jpratmarti has volunteered to look into this.

empEvil commented 4 years ago

Judit and I have tried implementing the TreeCorr jackknife method introduced in https://github.com/rmjarvis/TreeCorr/pull/96 into TXPipe. So far we have only added the simplest form of jackknife design, and we will have to address this in the future.

empEvil commented 4 years ago

For the two-point functions we have changed how the results from the TreeCorr calls are moved around the code. A lot more is now done in the write_output part than before, because calculating the combined covariance requires the TreeCorr objects themselves.

This means the following changes:

Things to consider in the future:

chihway commented 4 years ago

Pinging @zhzhuoqi @ajamsellem who have been doing jackknife testing with treecorr. Let's put discussions and progress here so everyone is on the same page.

Quick summary:

The main question is whether the TreeCorr implementation is consistent with a brute-force one. Once that is checked, the TreeCorr jk2 branch can be merged, and then this branch on TXPipe can be merged too.

@rmjarvis sorry for the scattered messages... please keep us posted when treecorr is ready for both of these tests

ajamsellem commented 4 years ago

Following up on comparing the NN jackknife implementations: the following plots show that the jackknifed errors are in good agreement on small scales, while the TreeCorr implementation seems to overestimate the error more as the scales get larger. (The "Original" label is the brute-force implementation.) This result is consistent in each redshift bin (one example bin shown) and in the total profile. Also, I've tried using both the CalculateXi and estimate_cov functions, and they give the same results.

Full: jackknife_test

One Redshift Bin Example: bin_2

rmjarvis commented 4 years ago

Thanks @ajamsellem. Could you post (or mail me or Slack me) some more detail about the code you used for this test? I'd like to understand the source of the disagreement, and unfortunately, the above plot isn't really enough for me to start investigating. Thanks!

rmjarvis commented 4 years ago

Thanks very much @ajamsellem and @zhzhuoqi. Both of your results and code were very helpful for figuring this out. It turns out that the handling of random catalogs required for doing the fast jackknife is quite a bit more subtle than my code originally assumed. I basically ended up completely redesigning that aspect of the code.

I think now NG, NK, and NN all handle randoms in a way that is consistent with the brute force jackknife calculations that you are doing. I added unit tests that do a brute force calculation on very small catalogs to compare with the automatic version, and now I'm getting consistent results for all of the various statistics. So I think that's working.

It even works now when some of the patches have no objects for one or more of the catalogs. So @ajamsellem, I think it should work even with the larger npatch value you originally had, where it had been giving errors.

So please pull the latest version of jk2 that I just pushed, and try your respective tests again. I think it should work better now! :)

ajamsellem commented 4 years ago

Running with the re-pulled jk2, the jackknives seemed to be in agreement. Thanks for fixing this @rmjarvis! jackknife_test

rmjarvis commented 4 years ago

Awesome! That's great. BTW, something I noticed that might be causing the slight difference still visible here: I think you have a slight error in your use of the np.cov function. The default is to use N-1 in the denominator, but AFAIU the jackknife formula should just use N. I think the right version using the np.cov function would be

C = np.cov(Sg.T, bias=True) * (len(Sg) - 1)

The difference is a factor of Npatch/(Npatch-1), which maybe is the difference between red and pink here. In TreeCorr, I do the matrix product by hand, but I use np.cov in the unit tests, and I discovered I needed this bias=True option to get it to match up.
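To illustrate with toy numbers (Sg here is a hypothetical array of jackknife samples, one row per patch), the bias=True version of np.cov matches the by-hand matrix product:

```python
import numpy as np

rng = np.random.default_rng(42)
npatch, nbins = 30, 5
Sg = rng.normal(size=(npatch, nbins))  # toy jackknife samples, one row per patch

# By-hand jackknife covariance: (N-1)/N * sum_i (x_i - xbar)(x_i - xbar)^T
d = Sg - Sg.mean(axis=0)
C_hand = (npatch - 1) / npatch * (d.T @ d)

# np.cov with bias=True divides by N (not the default N-1), so multiplying
# by N-1 reproduces the jackknife normalization exactly.
C_np = np.cov(Sg.T, bias=True) * (len(Sg) - 1)

assert np.allclose(C_hand, C_np)
```

Leaving bias at its default would inflate every element by Npatch/(Npatch-1), which is exactly the small offset described above.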

ajamsellem commented 4 years ago

The above plot was actually made with an explicit calculation of the error, without using numpy to compute the covariance. I did a comparison with and without the bias=True switch using the covariance calculation: with the switch it agrees with the above plot, and without it the errors change slightly, but I couldn't perceive the difference on the plot. That still leaves the difference between brute-force and TreeCorr unaccounted for, but it's pretty small. It also seems as though that error is smaller than the error that comes from randomness in the process of assigning jackknife regions, so hopefully the differences can be considered negligible...maybe(?).

rmjarvis commented 4 years ago

Ah! Maybe it's the difference between using Erin's kmeans code and using TreeCorr's internal kmeans?

ajamsellem commented 4 years ago

Yah. That could be it. I think I can test that by using the patch_centers switch (with the centers I'm using for brute-force) versus npatch. If you're right I should get the same difference.

rmjarvis commented 4 years ago

You can't use the ra,dec centers from Erin's code directly. You need to convert them to the xyz format that TreeCorr is expecting.

# ra = RA in degrees from Erin's kmeans code
# dec = Dec in degrees
import numpy as np
import coord
xyz = coord.CelestialCoord.radec_to_xyz(ra * (coord.degrees / coord.radians),
                                        dec * (coord.degrees / coord.radians))
centers = np.column_stack(xyz)

Warning -- untested, but I think that should do it.

zhuoqizhang commented 4 years ago

Hi Mike @rmjarvis, I have checked the update to TreeCorr, but didn't see changes in NK; I mean, the NK calculateXi code looks the same as before. Could you please check that the file is up-to-date? Thanks!

rmjarvis commented 4 years ago

@zhzhuoqi Are you sure you are getting the latest jk2 branch? The code should look like this: https://github.com/rmjarvis/TreeCorr/blob/jk2/treecorr/nkcorrelation.py#L394

The next function after this is completely new, so if you see _calculate_xi_from_pairs, then you definitely have the new code. If not, you don't.

zhuoqizhang commented 4 years ago

@rmjarvis Thanks for checking! I'm using the right version indeed. I have re-run the NK random-subtraction jackknife, but it still gives the old results. I checked with @ajamsellem and I think our tests in principle should be the same. But, weirdly, the fix shows up in NN but not in NK. My jk code is here; would you mind double-checking that the methods are consistent? https://github.com/zhzhuoqi/bazinga73/blob/master/NK_jackknife_test.ipynb

Thanks a lot!

rmjarvis commented 4 years ago

I think the issue is probably that Jupyter isn't re-importing the new TreeCorr code.

It looks like your cell numbers are pretty high at this point, so I'm guessing you were counting on import treecorr picking up the new version after it was changed. But once Python has imported something, this command won't re-import it.

The best solution is to restart the Jupyter kernel and run all the cells from scratch.

The faster solution (e.g. if some of the data manipulations before getting to the TreeCorr part take a long time, and you don't want to redo them) is reload(treecorr). I think that is supposed to work to reload the treecorr module, but I don't have much experience with it. So maybe try that first, and if it doesn't change things, do the kernel restart.

cf. https://support.enthought.com/hc/en-us/articles/204469240-Jupyter-IPython-After-editing-a-module-changes-are-not-effective-without-kernel-restart
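For what it's worth, in Python 3 the bare reload() builtin is gone and the function lives in importlib, so the reload route would look like this (shown with a stdlib module standing in for treecorr):

```python
import importlib
import json  # stand-in for treecorr in this sketch; the same pattern applies

# reload() re-executes the module's top-level code in place and returns the
# module object, so names reached via "json.xxx" see the updated definitions.
m = importlib.reload(json)
assert m is json
```

One caveat: objects created before the reload still carry the old class definitions, which is part of why a full kernel restart is the safer fix.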

rmjarvis commented 4 years ago

BTW, a way to check if that was the problem is to type:

import inspect
import treecorr
print(inspect.getsource(treecorr.NKCorrelation._calculate_xi_from_pairs))

If you get an AttributeError or if the code doesn't look like the code here, then you don't have the right TreeCorr version loaded.

zhuoqizhang commented 4 years ago

@rmjarvis Thanks so much for identifying the problem! I just loaded the right version, and jackknife is in good agreement in the NK too!

rmjarvis commented 4 years ago

The jk2 branch is now merged into TreeCorr master, so any use of jk2 in deployment scripts should switch over to use master instead.

ajamsellem commented 4 years ago

Went back and fed the patch centers from Erin's kmeans code to TreeCorr and got basically the same results from the two codes, so that doesn't seem to be the source of the discrepancy. kmeans_comparison

ajamsellem commented 4 years ago

I caught a bug in the jackknife regions of the files I was using, which changed the output of the brute-force code. This changed the differences in the plot I posted a few days ago (i.e. between brute-force and TreeCorr) to almost exactly a factor of two. I've been poring over my code for the past few days and can't seem to find anything obvious (like an extra 2 sitting around somewhere :)), but I'll look a little more. @rmjarvis, if you have any idea on the TreeCorr side where this might come from, please let me know. jackknife_test

rmjarvis commented 4 years ago

What bug did you find? I based my unit test on the code you sent me, and I thought it was all correct for how to do the brute force version. Here is the unit test: https://github.com/rmjarvis/TreeCorr/blob/master/tests/test_patch.py#L2044 Is there something in there that you think is wrong? I don't see any obvious place for a factor of 2 error.

ajamsellem commented 4 years ago

The bug wasn't in the code. When I changed the code to run on 30 patches, I forgot that the data files had assigned clusters & galaxies with 100 patches. The unit test seems fine/similar to what I have in my code, and you're right that I don't see any place a factor of two would pop out. I'll keep looking in my code and see if I find something.

rmjarvis commented 4 years ago

I guess sqrt(10/3) is close enough to 2 that it might be it. But when I ran my version of your code, I trimmed the input catalogs severely to make it run in a shorter time, so I had to redo the patch identification. And then I got perfect matches between your code and the current TreeCorr. So I'm a little confused how a factor related to sqrt(Npatch) could be messing it up.