Add expected correlation for mocks

londumas commented 5 years ago

Add expected correlation for mocks. For Saclay mocks, we can add a "transfer function": xi_mocks = f (xi_CAMB). Worked on branch https://github.com/igmhub/picca/tree/add_transfer_function_mocks

xi_g_to_xi_F.txt

londumas commented 5 years ago

Here is the a plot of the current transfer function @jmarclegoff gave us. Could you produce one with a finer grid?

tranfer_function

xi_g_to_xi_F.fits.gz

londumas commented 5 years ago

@TEtourneau, could you give me permissions to read and copy all files in /global/u1/t/tetourne/Git/picca_data/ana/Out/debug_1024_52/from_transmissions/Correlations/. @jmarclegoff, could you copy the code you used to produce the following plot?

pastedgraphic-4

@jmarclegoff, here is a simple code to produce directly a fits file:

import fitsio
import scipy as sp
import matplotlib.pyplot as plt

def main():

    tranfer = sp.loadtxt('xi_g_to_xi_F.txt')
    out = 'xi_g_to_xi_F.fits.gz'

    h = fitsio.FITS(out,'rw',clobber=True)
    #head = [ {'name':'RPMIN','value':rp_min,'comment':'Minimum r-parallel'}]
    comment = ['Transfer function']
    h.write([tranfer],names=['T'],comment=comment,extname='T') #,header=head

    h.close()

    return
main()

londumas commented 5 years ago

@TEtourneau, I just created a branch to use the mock transfer function: add_transfer_function_mocks. Simply add tranfer-func-mock = < path to the following transfer function xi_g_to_xi_F.fits.gz > to the [model] section of your config.ini file. The file is above at the bottom of one of the comments. I did not manage to fit the correlation for the moment, you and @jmarclegoff, might be able to tweak it to work.

jmarclegoff commented 5 years ago

@londumas, concerning the transfer function, it takes about 1s per point on my laptop, so I can make a thiner grid but I am not sure it is really needed, the function is pretty smooth and can be easily interpolated. How many points would you want ? This transfer function is of course z dependent. So we should also define a grid in z. Sure, I will put everything in a fits file, otherwise in an ascii file that will get messy. Thanks for the exemple of fits code, good to see how to add comment and extname.

jmarclegoff commented 5 years ago

@londumas, concerning the plot, it was produced by Thomas using my code to compute the prediction. The code is part of the Saclay mock code. Do you want us just to copy it in a comment, so you can have a look to it, or do you want us to include it in picca ?

londumas commented 5 years ago

@jmarclegoff, for the plot in a comment it is perfect, so I can have a look at it. For the grid, let's have a working code and then worry about having it finer and depend on redshift, you are right.

jmarclegoff commented 5 years ago

@londumas, for the fit of the correlation function that did not work, which parameters did you leave free ? Is it possible to fix all parameters, just to get the chi^2 between the prediction and the mock, probably for r > 10, since below there is a clear problem. By the way, is there a way in GitHub to attach a comment to a previous comment to which it refers ?

londumas commented 5 years ago

@jmarclegoff simply use ">". For the code, I need you and @TEtourneau to test the branch and fit with the parameter free and fix, as you prefer. You know better than me.

jmarclegoff commented 5 years ago

@londumas, by simply use ">" you mean I should first copy the comment I want to answer and put ">" in front of it ?

I agree, it makes sense that we test the branch. We need to be able to use it, thanks for the implementation.

Here is the code for the prediction. I did not include class xi_Hamilton() : which just compute Hamilton (1992) xi(r,mu) formulas

#********************************************************************
def xi_predicted(xi_g, f, dfbar, delta_s=False, sigma=np.inf):
    # This function returns the predicted xi after applying f to a GRF g
    # g should be normalized such that sigma_g=1, which implies |xi_g| <1
    # see eq 2.6 in Font et al. 2012
    # the numerical integral is not reliable for 0.005 < x_g < 1
    # delta_s = True => add small scales in prediction
    if delta_s == True:
        def integrand(dl1, dl2, ds1, ds2):
            num = np.exp(-(dl1**2 + dl2**2 - 2*dl1*dl2*xi_g)/(2*(1-xi_g**2)))
            denum = 2*np.pi*np.sqrt(1 - xi_g**2)
            return ((f(dl1, ds1)/dfbar-1)*(f(dl2, ds2)/dfbar-1) 
                    * num/denum * np.exp(-0.5*(ds1**2+ds2**2))/(2*np.pi))
        return integrate.nquad(integrand, [[-sigma, sigma], [-sigma, sigma], [-sigma, sigma], [-sigma, sigma]])
    else:
        if ( (abs(xi_g) > 1) | (xi_g == -1) ) :
            print "******** error in xi_predicted, xi_g=", xi_g
            exit(0)
        if ( (xi_g > 0.995) & (xi_g < 1) ) :
            print "**** Warning, computation of xi_predicted not reliable for 0.995 < xi_g < 1 *****"
            print "should rather interpolate linearly bewteen 0.995 and 1."
        if (xi_g==1) :
            def integrand(dl):
                num = np.exp(-dl**2/2)
                denum = np.sqrt(2*np.pi)
                return (f(dl)/dfbar-1)**2 * num/denum
            return integrate.quad(integrand, -np.inf, np.inf)
        def integrand(dl1, dl2):
            num = np.exp(-(dl1**2 + dl2**2 - 2*dl1*dl2*xi_g)/(2*(1-xi_g**2)))
            denum = 2*np.pi*np.sqrt(1 - xi_g**2)
            return (f(dl1)/dfbar-1)*(f(dl2)/dfbar-1) * num/denum
        return integrate.dblquad(integrand, -np.inf, np.inf, lambda x: -np.inf, lambda x: np.inf)

#*************************************************************
class xi_prediction() :
# given a FGPA transformation F = exp[ -a exp (b G g) ]
# and a GRF g = delta + c eta, characterized by sigma_g and c  
# computes xi_g and the resulting xi_F
# see Font Ribera at al. (2012) eq 2.7
# DX is the width of the Gaussian smearing applied to the GRF field

    def F(self,delta):  # FGPA function
        return np.exp(-self.a * np.exp (self.b * delta) )

    def F_pdf(self,F):   # returns F x pdf(F)
#   F=exp[ -a exp(b g)] = exp[-exp(bg+ln(a))]  
#   where g = gaus(0,1) and bg+ln(a) = Gaus(ln(a),b)
#   pdf(F) = 1/(F tau \sqrt{2PI} b) exp [-(ln(tau)-ln(a))^2/(2b^2)]    
#       with tau = -ln(F)
        b=self.b
        a=self.a
        if (b==0):
            print "b=0 in F_pdf\n"
            exit(0)
        if ((F<=0) | (F>=1)):
            print "F=",F, "in F_pdf\n"
            exit(0)
        tau = - np.log(F)
        xx = np.log(tau/a)
        xx = np.exp(- xx*xx/2./b/b)
        # xx /= F * tau * np.sqrt(2 * np.pi) * b  # pdf of F
        xx /= tau * np.sqrt(2 * np.pi) * b      # F x pdf of F
        return xx

    def pdf_integrale(self,Fmax) :
        # compute int_0^Fmax pdf(F)dF using analytical primitive of pdf
        b=self.b
        a=self.a
        u = np.log(a) -np.log(-np.log(Fmax))
        xx = 0.5 * ( 1 + sp.special.erf( u/b/np.sqrt(2) ) )
        print 1-Fmax,a,b,1-xx
        return xx

    def ComputeFmean(self):     # checked wrt simu1D.cpp
        # pdf diverges in 0 and 1, making numerical integration unstable
        # 
        epsilon = 1E-8
        Fmean , error = sp.integrate.quad(self.F_pdf,0,1-epsilon,limit=50)
        #print epsilon,a,b,Fmean
        Fmean += 1- self.pdf_integrale(1-epsilon)
        return Fmean

    def __init__(self,a,b,G,sigma_g,c,DX=2.19):
        self.a = a
        self.b = b * G * sigma_g
        self.sigma_g = sigma_g
        self.c = c
        self.DX = DX
            #...........................       P(k)
        k = np.linspace(0,10,10000)
        #Pcamb = PowerSpectrum.P_0()
        Pcamb = P_0()
        P = Pcamb.P(k)
        W = np.exp(-k*k*DX*DX/2)
        P *= W*W
            #...........................       xi_g(r)
        r,xi = xi_from_pk(k,P)
        rmax=300
        cut=np.where(r<rmax)
        self.r=r[cut]
        self.xi=xi[cut]
        self.xi_Ham = xi_Hamilton(r,xi,rmax)
            #...........................        xi_g -> xi_F
        Fmean = self.ComputeFmean()
        xi_g=np.linspace(-0.99,1,11)
        xi_F = np.zeros(len(xi_g))
        for i in range(len(xi_g)):
            #print i, xi_g[i]
            xi_F[i], err = xi_predicted(xi_g[i], self.F, Fmean)
        self.xig2xiF=interpolate.InterpolatedUnivariateSpline(xi_g,xi_F)

#................. main function, returns the predicted xi_F
    def xi_F(self,r,mu):
        xi_g = self.xi_Ham.xi(self.c,r,mu) / self.sigma_g**2
        return self.xig2xiF( xi_g )

#.... BEWARE:  the next three functions return xig2xiF applied
# to the multipoles of xi_g. But since F(g) is non linear, these
# are not the exactly the multipoles of xi_F. 
    def xi0_F(self,r):
        xi_g = self.xi_Ham.xi0(self.c,r) / self.sigma_g**2
        return self.xig2xiF( xi_g )

    def xi2_F(self,r):
        xi_g = self.xi_Ham.xi2(self.c,r) / self.sigma_g**2
        return self.xig2xiF( xi_g )

    def xi4_F(self,r):
        xi_g = self.xi_Ham.xi4(self.c,r) / self.sigma_g**2
        return self.xig2xiF( xi_g )

londumas commented 5 years ago

@jmarclegoff, yes copy the comment in front of ">". By the way "``` long text ```" allows to quote a long code, I edited your last comment for that.

TEtourneau commented 5 years ago

@londumas You should now be able to access /global/u1/t/tetourne/Git/picca_data/ana/Out/debug_1024_52/from_transmissions/Correlations/, it's part of the desi group.

Concerning picca, I tested the branch, using the .fits.gz file you produced. It seems that the model for Xi doesn't go to zero properly:

jmarclegoff commented 5 years ago

I believe xi_F is not going to zero because the linear interpolation of xi_F(xi_g) in picca results in xi_F(xi_g=0) = 1E-4. In the xi_prediction() class I am using rather self.xig2xiF=interpolate.InterpolatedUnivariateSpline(xi_g,xi_F). I have attached a new xi_g_to_xi_F.fits.gz with 81 bins and most important a bin (xi_g,xi_F) = (0,-2E-13) please @TEtourneau try fitting again with this new file. In the future we could instead tabulate xi_F/xi_g vc xi_g which varies only between 0.05 and 0.17.

xi_g_to_xi_F.fits.gz

TEtourneau commented 5 years ago

@jmarclegoff I think you linked the wrong file. It's exactly the same as before.

londumas commented 5 years ago

@jmarclegoff, is there a reason why it is not (xi_g,xi_F) = (0,0), but it is (0,0)? If it is simply numerical issues, but the exact computation gives (0,0) let's add by hand (0,0) to the file.

jmarclegoff commented 5 years ago

@londumas physically is is clear that if g(x) et g(y) are uncorrelated and F=F(g), then F(x) and F(y) will be uncorrelated, so it should indeed be (0,0). This is numerical uncertainty related to the accuracy of the computation of . Instead of adding (0,0) in the file, I could also modify xi_predicted() such that it returns 0 if x_g =0. Are you ok to modify picca such that it expects xi_F/xi_g vs xi_g ?

londumas commented 5 years ago

@jmarclegoff,

instead of adding (0,0) in the file, I could also modify xi_predicted() such that it returns 0 if x_g =0.

That sounds perfect.

Are you ok to modify picca such that it expects xi_F/xi_g vs xi_g ?

It is more complicated to implement at the picca/fitter2 level. I thus prefer xi_F vs xi_g.

I think you linked the wrong file. It's exactly the same as before.

Could you send the proper file?

jmarclegoff commented 5 years ago

I must have take the file from the wrong directory. This one should be ok xi_g_to_xi_F.fits.gz

TEtourneau commented 5 years ago

I think it's still the same file

TEtourneau commented 5 years ago

Sorry my bad, it's fine this time. I'm running the fitter.

TEtourneau commented 5 years ago

Here is what I get when using the model: tranfer-func-mock = xi_g_to_xi_F.fits.gz

The chi2 is 1787. It doesn't seem to change much. I will try to fit with (bias, beta) fixed to the predicted value for this redshift.

londumas commented 5 years ago

@TEtourneau, The model is missing the window function also. Could you take care of that?

TEtourneau commented 5 years ago

Is it not taken into account in Xi_F(xi_g) ?

londumas commented 5 years ago

@TEtourneau, no. @jmarclegoff do you confirm that?

jmarclegoff commented 5 years ago

@TEtourneau are you sure of your plot ? This looks pretty much like the normal model of picca. @londumas This is correct, the Gaussian window function is not included in xi_F(xi_g) it should be included in P(k).

TEtourneau commented 5 years ago

Yes I know. The only thing I changed between the two fits is the line in: [model] tranfer-func-mock = data/xi_g_to_xi_F.fits.gz Maybe I'm missing something ?

But in fact, it changes a bit the fit, especially the value of bias_eta:

Fit without transfer function: chi2 = 1787.79 ap = 1.02 at = 0.996 beta = 1.46 bias_eta = -0.176 par binsize = 13.4 per binsize = 12.3
Fit with transfer function: chi2 = 1787.21 ap = 1.02 at = 0.996 beta = 1.47 bias_eta = -0.559 par binsize = 13.5 per binsize = 12.2

jmarclegoff commented 5 years ago

@londumas When Thomas made pseudo data with xi(r,mu)=xi_pred and cov is the cov of the mocks he obtained the dotted lines in the plots at the bottom of https://desi.lbl.gov/trac/wiki/LymanAlphaWG/SaclayMocks They fit quite well the mocks (without free parameters) but at r < 10 and mu > 0.8. So I expect that the prediction, once included in picca, fits quite well the mocks and I am surprised by Thomas new plots. In addition we see that the prediction and the usual model give curves which are identical by eye. The fit parameters are also very close, only bias_eta changes significantly. But this depends how you define bias_eta in the framework of the prediction, since in the prediction the bias is coming essentially through xi_F(xi_g). By the way if we want to compare to the nominal prediction what values of the parameters should we use, clearly at = ap =1, beta=1.6 and par and per binsize =0, but what about bias_eta. I would have said beta = f bias_eta / bias, with f = 1 and bias=1 since we do not apply a bias to g, which gives bias_eta=1.6, relative to matter at z=0. If we compare to matter at z=2.3, this will make bias_eta even larger. I am somewhat confused as to what is bias_eta in case we use the prediction as the model.

Let me make sure we agree on the prediction: a field g with P_g = (1+ beta mu^2)^2 P_camb(z=0) which can be converted to a prediction for xi_g. and then xi_F= xig2xiF (xi_g).

jmarclegoff commented 5 years ago

@londumas j'ai regardé un peu dans le code, guidé par Thomas. En fait il suffit de chercher les modif récente dans github, c'est pratique. Il me semble que dans data.py il faudrait faire qqchose comme de rajouter if self.tranfer_func_mock is None: avant les lignes 307 et 308

        xi *= self.z_evol[self.tracer1['name']](self.z, self.tracer1, **pars)*self.z_evol[self.tracer2['name']](self.z, self.tracer2, **pars)
        xi *= self.growth_function(self.z, **pars)**2

Sinon on applique xi_g_to_xi_F à G(z) g et non pas à g.

londumas commented 5 years ago

@jmarclegoff and @TEtourneau, I let you do the necessary changes to the branch. Normally you have all writing permissions for branches.

TEtourneau commented 5 years ago

I did the modifications (remove z_evol and growth_function when using xi_pred), but it doesn't change much: chi2=1786 ap=1.02 at=0.997 beta=1.41 bias_eta=-0.542 par binsize=12.5 per binsize=13.2

Is there still a bias dependency somewhere ?

jmarclegoff commented 5 years ago

@londumas Thomas removed the growth factor and it changed very little, even bias_eta changed only by 3%. I though we just had to add a condition before multiplying by the growth factor but clearly I do not understand how the code is working and we would need your help.

londumas commented 5 years ago

@TEtourneau could you push the changes you made to the branch? I will have a look. Could you also give the chi.ini and the config.ini you are using?

TEtourneau commented 5 years ago

I just pushed my commit. Here is the files:

chi2.ini

[data sets]
zeff = 2.29
ini files = Out/debug_1024_52/from_transmissions//Fit/config_cf.ini 

[fiducial]
filename = PlanckDR12/PlanckDR12.fits

[cosmo-fit type]
cosmo fit func = ap_at

[output]
filename = Out/debug_1024_52/from_transmissions//Fit/result_cf.h5

And config.ini:

[data]
name = LYA(LYA)xLYA(LYA)
tracer1 = LYA
tracer2 = LYA
tracer1-type = continuous
tracer2-type = continuous
filename = Out/debug_1024_52/from_transmissions//Correlations/e_cf.fits
ell-max = 6

[cuts]
rp-min = 0.
rp-max = 200.

rt-min = 0.
rt-max = 200.

r-min = 10.
r-max = 180.

mu-min = 0.
mu-max = 1.

[model]
model-pk = pk_kaiser
model-xi = xi
z evol LYA = bias_vs_z_std
growth function = growth_factor_no_de
tranfer-func-mock = data/xi_g_to_xi_F.fits.gz
# pk-gauss-smoothing = pk_gauss_smoothing

[parameters]

ap = 1. 0.1 0.5 1.5 free
at = 1. 0.1 0.5 1.5 free
bao_amp = 1. 0 None None fixed

sigmaNL_per = 3.24     0 None None fixed
1+f         = 1.966    0 None None fixed
growth_rate = 0.966    0 None None fixed

bias_eta_LYA  = -0.17  0.017 None None free
beta_LYA  = 1      0.1 None None free
alpha_LYA = 2.9    0   None None fixed

par binsize LYA(LYA)xLYA(LYA) = 4 0.4 0 None free
per binsize LYA(LYA)xLYA(LYA) = 4 0.4 0 None free

# par_sigma_smooth = 3 0.4 0 None free
# per_sigma_smooth = 3 0.4 0 None free

londumas commented 5 years ago

@jmarclegoff and @TEtourneau, I have very little time in the following 4 weeks. I will let you figure things out.

londumas commented 5 years ago

@TEtourneau, I just updated the "add_transfer_function_mocks" branch with master. Please don't forget to "git pull" next time you want to work on it.

igmhub / picca

Add expected correlation for mocks #562