manodeep / Corrfunc

⚡️⚡️⚡️Blazing fast correlation functions on the CPU.
https://corrfunc.readthedocs.io
MIT License
163 stars 50 forks source link

Strange peak in correlation function of N-Body simulation #188

Closed fgmaion closed 5 years ago

fgmaion commented 5 years ago

General information

Issue description

When calculating the correlation function for an N-Body simulation, a strange peak appears after the BAO peak.

Expected behavior

The correlation function is expected to go smoothly to zero at scales of 200*h^-1, and is supposed to be well behaved at this scale since I am working with MultiDark simulation which is performed in a comoving box of 1Gpc size.

Actual behavior

There appears a peak at scales of 200h^-1 in the plot of r^2*xi(r) which was not expected, and the correlation seems to become unstable, producing large oscillations.

What have you tried so far?

I have tried changing the r_max, the number of bins, the bin refinement factor. I also tried calculating the correlation function by calculating counts DD, DR and RR and using convert_to_correlation_function instead of using the function xi, but the result was the same.

Minimal failing example

Image of the plot of r^2*xi(r) by r. corrfunc

Code for producing my result, where X, Y, Z were taken from the MultiDark simulation. corrfunc.txt

import Corrfunc

# rest of sample code goes here...
lgarrison commented 5 years ago

I think this is likely due to finite volume effects in the MultiDark box; 200 Mpc/h is only 1/5th of the box size, so the quantization of modes is very noticeable. But you could check this by running another correlation function code (like TreeCorr or kdcount) and comparing the result with Corrfunc's.

manodeep commented 5 years ago

Thanks @lgarrison. Yup - that's the first thing to check. @fgmaion If you have never used either of those codes before, you might find this gist helpful to figure out the syntax.

I could imagine that If the number of cells along one dimension are very few (say nx < 2*xbin_ref + 1), then we could have a duplicate cell-pair but turns out there is a check for that. So that particular bug is unlikely to be occurring.

@fgmaion Could you please re-run the code with a few lower values of bin refine factors -- say, (1, 1, 1) (2, 2, 1), etc? If the results are exactly the same between the runs, then there might be a problem in the bin refinements. (If the code takes too long, one option would be to subsample the particle distribution)

manodeep commented 5 years ago

Simply attaching the code in corrfunc.txt so it renders inline:

import Corrfunc.theory.xi as xi
import numpy as np
import matplotlib.pyplot as plt

boxsize = 1000.0 #boxsize is 1Gpc
nthreads= 8

rmin = 1.0
rmax = 250
nbins = 70
rbins = np.logspace(np.log10(rmin), np.log10(rmax), nbins+1)

results = xi(boxsize, nthreads, rbins, X, Y, Z, verbose = True, xbin_refine_factor = 7, ybin_refine_factor =7, zbin_refine_factor = 5)

R = []
Xi = []
for i in range(nbins):
    Xi.append(results[i][3] ) #Correlation function
    R.append(0.5*(rbins[i]+rbins[i+1]) )   #average value of r for ith bin

RR=np.asarray(R)
XX = np.asarray(Xi)
plt.plot(RR,RR**2*Xi)
plt.xlabel('$r(h^{-1}Mpc)$')
plt.ylabel('$r^2\\xi(r)(hMpc^{-1})$')
fgmaion commented 5 years ago

I have re-run the code with the new bin-refinement factors and the results are in the images below 211 for the (2,1,1)

111 and this is for the (1,1,1).

It seems there was no change. Will do the calculation with the other codes in the next few days.

lgarrison commented 5 years ago

@fgmaion Have you had a chance to compare the result with other codes? If not, do you mind if I go ahead and close the issue?

fgmaion commented 5 years ago

No problem, I closed it. When I get a chance to do it I'll take a look at the results and I'll come into contact again.

Thank you for your time.

lgarrison commented 5 years ago

No problem! Feel free to re-open this later on.