rmjarvis / TreeCorr

Code for efficiently computing 2-point and 3-point correlation functions. For documentation, go to
http://rmjarvis.github.io/TreeCorr/

Implement multipole algorithm for 3pt correlations #167

Closed. rmjarvis closed this 10 months ago

rmjarvis commented 10 months ago

This PR has the implementation of the multipole algorithm for doing 3pt correlation functions.

It's nominally a new BinType, LogMultipole, but the intended usage is to run the measurement with that binning and then convert to LogSAS binning using the method corr.toSAS().
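
Schematically, the intended workflow is something like the sketch below. The catalog construction and numbers are placeholders, and the parameter names max_n (how many multipoles to keep) and nphi_bins (the phi binning of the converted object) are illustrative, not necessarily the final API.

import treecorr

# Placeholder shear catalog; file name, columns, and units are illustrative only.
cat = treecorr.Catalog('cat.fits', ra_col='ra', dec_col='dec',
                       g1_col='g1', g2_col='g2',
                       ra_units='deg', dec_units='deg')

# Measure in multipole space; max_n sets how many multipoles are kept.
gggm = treecorr.GGGCorrelation(min_sep=1., max_sep=100., nbins=20,
                               sep_units='arcmin',
                               bin_type='LogMultipole', max_n=30)
gggm.process(cat)

# Convert to the usual LogSAS (d2, d3, phi) binning.
ggg = gggm.toSAS(nphi_bins=30)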

This method is several orders of magnitude faster than the normal tree-based 3pt calculation (which in turn is many orders of magnitude faster than the naive brute force algorithm). The GGG test of Map^3 takes many hours to run with the regular LogSAS binning, but takes under two minutes on my laptop using LogMultipole. This is using 10 threads, and a little over 1 GB of memory, so not unreasonable.

There are still a few additional features I'd like to add related to this, but the code on this branch is functional if people want to start trying it out.

Still TBD (on other PRs):

  1. Add a way for a LogSAS-binned 3pt correlation object to use the multipole algorithm for the calculation, rather than creating one with bin_type='LogMultipole' and then converting. I'm thinking of an optional parameter to process, e.g. algorithm='multipole', which would make the appropriate LogMultipole object, run that, and then convert back using toSAS (see the sketch after this list). Maybe even have this be the default for LogSAS, since I suspect this is what everyone will want to use normally. The old recursion will likely be of historical interest at most now.
  2. I need to add tests of the variance estimate. I didn't put any effort into making sure those were even plausibly accurate (never mind writing tests of their accuracy), so they probably aren't.
  3. The multipole algorithm doesn't fully support patches yet. This is because the algorithm needs to complete Gn and Wn for all other cells before finishing up with each P1 cell to compute "Zeta" (what Porth et al call Upsilon). I think I know what to do for this, but there's a lot of refactoring of the patch stuff required. Basically, I need to have slightly larger patches for the catalogs with points 2 and 3. They only need to be max_sep larger than the regular patches. But then I wouldn't have to do any cross correlations (which wouldn't work for multipole anyway). I plan to have this alternate scheme be an option for all patch-based calculations. There may be other use cases where that is preferred over the current approach. But regardless, it is required to make multipole work.
  4. The GGG calculation is a little inaccurate for spherical coordinates at largish angles, which I think is due to the Fourier expansion of the shear projection not quite following the same math as the planar case. I have to think a bit harder about whether and how to improve this. Currently the GGG calculation has sub-percent accuracy up to around a degree or so, but gets worse at larger angles.
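
As a rough sketch of item 1, the proposed interface (not implemented on this branch; the algorithm keyword is just the idea being floated) would let users stay in LogSAS binning throughout, continuing from the example above:

# Hypothetical future interface for item 1; not implemented on this branch.
ggg = treecorr.GGGCorrelation(min_sep=1., max_sep=100., nbins=20,
                              sep_units='arcmin', bin_type='LogSAS',
                              nphi_bins=30)
# This would internally build a LogMultipole object, run it, and convert via toSAS.
ggg.process(cat, algorithm='multipole')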

For now, the only patch-based calculation it can handle is when cat1 has patches and cats 2 and 3 do not. So to do an auto-correlation with patches, you need to make two copies of the catalog, one with patches and one without, and run it as a cross-correlation. E.g.

# Two copies of the same catalog: one split into patches, one not.
cat_nopatch = treecorr.Catalog(config)
cat_patch = treecorr.Catalog(config, npatch=npatch)
# Run the auto-correlation as a formal cross-correlation.
corr.process(cat_patch, cat_nopatch)

This will work, and it can produce patch-based covariances. But it won't really work for a low-memory application. Indeed, it uses more memory than if you didn't use patches.
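
For the covariance itself, the expectation is that the standard estimate_cov interface carries over unchanged, e.g. something like:

# Patch-based covariance, assuming the usual TreeCorr interface applies here.
cov = corr.estimate_cov('jackknife')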

rmjarvis commented 10 months ago

Update: I figured out how to get the spherical projections to work right. Rather than rely on the Gn(n-3) thing to rotate all three shears, I just project the shears at points 2 and 3 properly and include them that way in the Gn array. Then I use n-1 and n+1 in the formulas to apply exp(-2i phi) to g1.

Now the spherical tests for GGG go out to 100 degrees and pass at 1.e-5 tolerance, so I think that's all good now.