Closed julienguy closed 1 year ago
Thanks @iprafols for the suggestions. I am going to do that and work on implementing the same approach for the QSO x Lya cross-correlation. I checked on the Y1 analysis that this fast method, with a 2 Mpc/h binning for the metal matrices, gives almost the same chi2. I still want to run more checks though. (So this PR is still a work in progress.)
Compare:
/global/cfs/cdirs/desicollab/users/jguy/lya/iron-tests-jg/vegafits/vega-3-0-0-16/lyaxlya_lyaxlyb_lyaxqso_lybxqso/logs/baseline_vegafit_3-0-0-16.out
* Result with new metal computation for lyalya and lyalyb:
/global/cfs/cdirs/desicollab/users/jguy/lya/iron-tests-jg/vegafits/vega-3-0-0-17/lyaxlya_lyaxlyb_lyaxqso_lybxqso/logs/baseline_vegafit_3-0-0-17.out
Ok, I see, so for now I'll flag it as a draft PR. Let me know when you want more feedback, but the results are really encouraging!
For SiII(1190), wavelengths in the QSO rest-frame interval [1190A,1205A] cannot be used to compute the matrix because they would correspond to absorption at a redshift greater than that of the quasar. This test is in place in the full calculation, but it is ignored in the fast computation. However, I verified that this has a negligible effect on the metal matrix (see plot below). I did this by commenting out the zmet<zqso test in the full matrix calculation. The effect has to be smaller for SiII(1193) and nonexistent for other transitions > 1205A in the rest frame, so I think we can accept this approximation (the difference is smaller than the fluctuation from one rt bin to the next when computed with the standard method).
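To make the zmet<zqso argument concrete, here is a minimal sketch (not code from this PR) that computes, for each transition, the part of the forest rest-frame range where a metal absorber would lie behind the quasar. The line wavelengths are approximate values typed in by hand, the 1040A lower edge of the forest is an assumption, and the helper names are hypothetical.

```python
# Approximate rest-frame wavelengths in Angstrom (illustrative values typed
# by hand for this sketch, not read from picca).
METAL_LINES = {
    "SiII(1190)": 1190.4158,
    "SiII(1193)": 1193.2897,
    "SiIII(1207)": 1206.500,
    "SiII(1260)": 1260.4221,
}

FOREST_RF_MIN = 1040.0  # assumed lower edge of the forest (not from the PR)
FOREST_RF_MAX = 1205.0  # upper edge quoted in the comment above


def metal_redshift(lambda_obs, lambda_metal):
    """Redshift of an absorber of rest wavelength lambda_metal seen at lambda_obs."""
    return lambda_obs / lambda_metal - 1.0


def excluded_rest_frame_interval(lambda_metal,
                                 rf_min=FOREST_RF_MIN,
                                 rf_max=FOREST_RF_MAX):
    """Forest rest-frame interval where zmet > zqso, or None if there is none.

    zmet > zqso  <=>  lambda_obs / lambda_metal > 1 + zqso
                 <=>  lambda_rf = lambda_obs / (1 + zqso) > lambda_metal
    """
    low = max(lambda_metal, rf_min)
    return (low, rf_max) if low < rf_max else None


for name, wave in METAL_LINES.items():
    print(name, excluded_rest_frame_interval(wave))
```

With these numbers the excluded interval is widest for SiII(1190), narrower for SiII(1193), and empty for SiIII(1207) and SiII(1260), which is the ordering described above.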
I added the code to compute the cross-correlation metal matrix.
Example (takes 15s):
time picca_fast_metal_xdmat.py -i /global/cfs/cdirs/desi/science/lya/y1-kp6/iron-tests/deltas/delta-lya-3-0/Log/delta_attributes.fits.gz --drq /global/cfs/cdirs/desi/science/lya/y1-kp6/iron-tests/catalogs/QSO_cat_iron_main_dark_healpix_v0-altbal_zwarn_cut.fits --mode desi_healpix --out fast_metal_dmat_qso_x_lya_lowres.fits --rt-max 200 --np 150 --rp-min -300 --rp-max 300 --fid-Om 0.315 --fid-Or 7.963e-5 --abs-igm 'SiII(1190)' 'SiII(1193)' 'SiIII(1207)' 'SiII(1260)' --rebin-factor 3
New comparison, using a more precise estimate from the standard method: a smaller rt range (<40 Mpc/h) allows running it with rej=0.995.
The differences are smaller than in the previous post. One can imagine residual small differences coming from specifics of the survey that the fast method cannot correct for: for instance a slightly different redshift distribution of QSOs in some part of the footprint correlated with a higher S/N in the forests.
The agreement on the auto-correlation lya x lya is also good when running the std computation with a smaller rt range. See figures below (already posted on slack). So in my opinion this PR is ready for final review before merging.
Figure showing that the matrix elements are virtually independent of r_trans when computing the matrix with a sufficient number of pairs:
Figure comparing the std and fast method for the auto-correlation:
I made several modifications to make 'pylint' happier, except for some 'Line too long' complaints. In my opinion, breaking those lines with a backslash (\) makes the code harder to read.
Added script to compute the metal matrices using only the delta stack table that one can find in the delta_attributes file.
We assume in this code that the weights of pairs with the same longitudinal separation do not depend (on average) on the transverse separation.
With this hypothesis, the distortion is the same for all values of rtrans, and one can use the average weight as a function of wavelength to compute the response.
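The hypothesis above can be sketched as follows. This is a toy illustration under stated assumptions, not the implementation in this PR: it treats only the Lya x metal term at rt = 0, uses a flat LCDM distance without radiation, and takes the mean weight per observed wavelength as an input array. The function names and the SiIII(1207) wavelength are assumptions made for the example.

```python
import numpy as np

LYA = 1215.67          # Angstrom
SIIII_1207 = 1206.50   # approximate rest wavelength, illustrative only


def comoving_distance(z, omega_m=0.315, n_steps=2048):
    """Comoving distance in Mpc/h for flat LCDM (radiation neglected)."""
    z = np.atleast_1d(z).astype(float)
    zgrid = np.linspace(0.0, z.max(), n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zgrid) ** 3 + 1.0 - omega_m)
    dz = zgrid[1] - zgrid[0]
    # Cumulative trapezoidal integral of 1/E(z).
    cum = np.concatenate([[0.0], np.cumsum(0.5 * (inv_e[1:] + inv_e[:-1]) * dz)])
    return 2997.92458 * np.interp(z, zgrid, cum)  # c / (100 km/s/Mpc)


def fast_metal_matrix_1d(lambda_obs, weights, lambda_metal,
                         rp_max=200.0, n_bins=50):
    """Toy rp-only metal matrix built from mean weights per wavelength.

    Rows: rp bin computed assuming both pixels are Lya.
    Columns: rp bin when the second pixel is really the metal line.
    """
    weights = np.asarray(weights, dtype=float)
    r_lya = comoving_distance(lambda_obs / LYA - 1.0)
    r_met = comoving_distance(lambda_obs / lambda_metal - 1.0)
    pair_w = weights[:, None] * weights[None, :]
    rp_assumed = np.abs(r_lya[:, None] - r_lya[None, :])
    rp_true = np.abs(r_lya[:, None] - r_met[None, :])

    edges = np.linspace(0.0, rp_max, n_bins + 1)
    in_range = rp_assumed < rp_max
    both = in_range & (rp_true < rp_max)
    matrix, _, _ = np.histogram2d(rp_assumed[both], rp_true[both],
                                  bins=[edges, edges], weights=pair_w[both])
    # Normalise each row by the total weight of pairs in the assumed bin.
    norm, _ = np.histogram(rp_assumed[in_range], bins=edges,
                           weights=pair_w[in_range])
    return np.divide(matrix, norm[:, None],
                     out=np.zeros_like(matrix), where=norm[:, None] > 0)


lambda_obs = np.arange(3600.0, 5500.0, 2.0)   # toy wavelength grid
weights = np.ones_like(lambda_obs)            # toy mean weights
dmat = fast_metal_matrix_1d(lambda_obs, weights, SIIII_1207)
print("true-rp bin holding most of the weight for the first assumed bin:",
      np.argmax(dmat[0]))
```

Because only the mean weight per wavelength enters, the cost scales with the square of the number of wavelength bins instead of the number of pixel pairs in the survey, which is why such a computation can finish in seconds.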
Example:
Takes 30s (most of the time is spent writing the huge matrices).
Comparison of the response to a delta_k correlation. Blue: fast computation. Orange: result of the standard picca_metal_dmat.py run with the option --coef-binning-model 2, which takes forever. The orange curve has some statistical fluctuations because of the large rejection factor that has to be used.
SiIII(1207)
SiII(1260)
Maybe we can test this on mocks?