Open Christopher-Bradshaw opened 4 years ago
Hello @Christopher-Bradshaw! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:
Thanks! I think this is consistent with what we're doing in Corrfunc.theory.xi()
, so I agree with this change. Would you mind checking that Corrfunc.theory.xi()
and convert_3d_counts_to_cf()
now agree?
Hey @lgarrison I might be missing something (or just looking in entirely the wrong place) but it looks like Corrfunc.theory.xi()
doesn't use the LS estimator, just the basic DD/RR
.
No, you're absolutely right! I was mis-remembering that we also supported the natural estimator in convert_3d_counts_to_cf()
. Is there another consistency check we could do? Such as, supply analytic DR and RR to the LS estimator, and check that it asymptotes to the natural estimator (i.e. the result of Corrfunc.theory.xi()
) for a large number of randoms? I'm not entirely sure that makes sense... but maybe something along those lines? Basically, I want to make sure that we got the -1 correction right in all of our estimators. One can use a small number of data points (e.g. 10) when making this comparison to exaggerate the importance of the ND1 - 1
correction.
@Christopher-Bradshaw Thanks!
@lgarrison Since the pair-counts are designed to be the same for an auto-corr of D1
and a cross-corr of D1
with D1
, shouldn't the conversion from pair-counts to the cf be the same?
@lgarrison Would your gist to demonstrate how to correctly account for pair-counts in a Rmin=0.0
bin work?
It's definitely relevant, as it was designed to show that (ND-1)/V
is the right density to use for the randoms in the natural estimator: https://gist.github.com/lgarrison/1efabe4430429996733a9d29397423d2
That's why I was thinking that some kind of consistency check or limit should be possible to ensure that LS and the natural estimator are behaving the same with respect to the ND-1
factor.
That gist is actually really useful for what I am doing - I was using N
instead of N - 1
in some places too, so thanks!
The test I had locally that was failing basically just constructed data and randoms with the same clustering and then asserted that the auto-correlation was 0. With this change that is now true. We could maybe just do the same thing for all the places where counts are converted to a cf?
Also, just checking that by natural estimator you mean DD/RR?
Also, just checking that by natural estimator you mean DD/RR?
@Christopher-Bradshaw Yup!
@Christopher-Bradshaw @lgarrison Does anyone recollect what needs to happen with this PR?
I think that the current way that Corrfunc converts counts to a correlation function isn't quite right for autocorrelations. I think the normalization is slightly off (see equation 3 in the LS paper).
The effect of this is totally negligible when the number of pairs is high, and very small even when it isn't so this isn't a big deal... But I was confused why things weren't quite as I expected and it seems worth a fix.
Please let me know I am right about this - there's always a good change I've misunderstood something! This change is also pretty rough at the moment - I just wanted to have something to show. I'm happy to clean it up, write tests, etc later but didn't want to do that before checking this made sense!
Thanks!