manodeep / Corrfunc

⚡️⚡️⚡️Blazing fast correlation functions on the CPU.
https://corrfunc.readthedocs.io
MIT License
167 stars 53 forks source link

The convert_3d_counts_to_cf and convert_rp_pi_counts_to_wp function don't care about the weight. #334

Open MeloDi-23 opened 6 days ago

MeloDi-23 commented 6 days ago

General information

Issue description

I want to calculate the correlation function from a LSS catalogue. The LSS catalogue has some weighting factor, and I want to calculate the real weighted correlation function. But I find out that if I use DDrppi_mocks and convert_rp_pi_counts_to_wp, it will give the exactly the same wp no matter I use weighting or not. Is this ignored intentionally or by mistake?

Expected behavior

Get the correct $\xi$ and $w_p$ with weighting.

Actual behavior

The code does not care about the weighting.

What have you tried so far?

I checked the result of DDrppi_mocks, and I think it calculated the weight. Then I checked the source code of convert_rp_pi_counts_to_wp, and find out that it does not contain any code about the weight. So I'm trying to write my own version, making use of DD['weightavg'].

Minimal failing example

from Corrfunc.mocks import DDrppi_mocks
from Corrfunc.utils import convert_rp_pi_counts_to_wp
# read from some catalogue to galaxy and random.
Nd = len(galaxy)
Nr = len(random)

Nbins = 15
rp_min = 3
rp_max = 100
rp_bin = np.geomspace(rp_min, rp_max, Nbins+1)
r_p = (rp_bin[:-1]*rp_bin[1:])**0.5
pimax = 100

dd = DDrppi_mocks(autocorr=True, cosmology=1, nthreads=50, pimax=pimax, binfile=rp_bin,
                   RA1=galaxy['ra'], DEC1=galaxy['dec'], CZ1=galaxy['cz'], weights1=galaxy['w'], is_comoving_dist=True, weight_type='pair_product')
rr = DDrppi_mocks(autocorr=True, cosmology=1, nthreads=50, pimax=pimax, binfile=rp_bin,
                   RA1=random['ra'], DEC1=random['dec'], CZ1=random['cz'], weights1=random['w'], is_comoving_dist=True, weight_type='pair_product')
dr = DDrppi_mocks(
    autocorr=False, cosmology=1, nthreads=50, pimax=pimax, binfile=rp_bin, 
    RA1=galaxy['ra'], DEC1=galaxy['dec'], CZ1=galaxy['cz'], weights1=galaxy['w'], 
    RA2=random['ra'], DEC2=random['dec'], CZ2=random['cz'], weights2=random['w'], 
    is_comoving_dist=True, weight_type='pair_product')
w = convert_rp_pi_counts_to_wp(Nd, Nd, Nr, Nr, dd, dr, dr, rr, pimax=pimax, nrpbins=Nbins)
# calculate with weighting

dd = DDrppi_mocks(autocorr=True, cosmology=1, nthreads=50, pimax=pimax, binfile=rp_bin,
                   RA1=galaxy['ra'], DEC1=galaxy['dec'], CZ1=galaxy['cz'], is_comoving_dist=True)
rr = DDrppi_mocks(autocorr=True, cosmology=1, nthreads=50, pimax=pimax, binfile=rp_bin,
                   RA1=random['ra'], DEC1=random['dec'], CZ1=random['cz'], is_comoving_dist=True)
dr = DDrppi_mocks(
    autocorr=False, cosmology=1, nthreads=50, pimax=pimax, binfile=rp_bin, 
    RA1=galaxy['ra'], DEC1=galaxy['dec'], CZ1=galaxy['cz'], 
    RA2=random['ra'], DEC2=random['dec'], CZ2=random['cz'], 
    is_comoving_dist=True)
w_noweight = convert_rp_pi_counts_to_wp(Nd, Nd, Nr, Nr, dd, dr, dr, rr, pimax=pimax, nrpbins=Nbins)
# calculate without weighting

assert np.isclose(w, w_noweight).all()
manodeep commented 5 days ago

Thanks @MeloDi-23 for using Corrfunc and opening the issue. The utility uses only the raw pair counts and assumes weights are not involved. If you need a weighted correlation function, unfortunately, you will have to account for the semantics of the weights and the weighting scheme you have used and then normalise accordingly to get the correct correlation function.

MeloDi-23 commented 4 days ago

Thanks for the resply! Can you add some notice to that in the doc? e.g. in the Typical Tasks for Computing Correlation Functions part, the standard pipeline is first using DD- function to calculate the count and using convert- function to calculate the correlation. DD- function has parameters of weight, but convert- function does not utilize the result. I think this is rather misleading.

manodeep commented 4 days ago

Hmm - I suppose that's fair enough. Since you found the wording unclear - do you mind sending through a pull request with a proposed fix?