lesgourg / class_public

Public repository of the Cosmic Linear Anisotropy Solving System (master for the most recent version of the standard code; GW_CLASS to include Cosmic Gravitational Wave Background anisotropies; classnet branch for acceleration with neural networks; ExoCLASS branch for exotic energy injection; class_matter branch for FFTlog)

Weird Cl for galaxy density at different redshift bins #38

Open hsxavier opened 9 years ago

hsxavier commented 9 years ago

Hello, when generating galaxy number density Cls with 'nCl' in more than one redshift bin, the cross term between bins shows a weird result (a sharp peak) at a few low ells -- see plot below. The position of the peak depends on the redshift bins used. For neighbouring bins, the peak moves to higher ell as the redshift increases. Using a tophat selection function sharpens the peak, while a gaussian smooths it out. The input used to reproduce this result is below:

output = nCl
l_max_lss = 50
selection = tophat
selection_mean = 0.4, 0.5
selection_width = 0.05
nondiagonal = 1
root = ./output/test
headers = yes
format = class
write parameters = yeap
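For intuition about the tophat/gaussian difference mentioned above, the two selection shapes can be sketched in Python (a hypothetical helper, not CLASS code; CLASS normalizes and smooths the windows internally, so this only illustrates why a tophat has sharp edges while a gaussian does not):

```python
import math

def selection(z, mean, width, shape="tophat"):
    """Unnormalized selection function W(z) for one redshift bin.

    shape="tophat": 1 inside [mean - width, mean + width], 0 outside.
    shape="gaussian": exp(-(z - mean)^2 / (2 width^2)).
    Sketch only; CLASS applies its own normalization.
    """
    if shape == "tophat":
        return 1.0 if abs(z - mean) <= width else 0.0
    return math.exp(-0.5 * ((z - mean) / width) ** 2)

# The two bins from the .ini above: means 0.4 and 0.5, width 0.05.
print(selection(0.42, 0.4, 0.05))             # inside the first tophat -> 1.0
print(selection(0.46, 0.4, 0.05))             # just outside its edge   -> 0.0
print(selection(0.5, 0.4, 0.05, "gaussian"))  # two sigma away -> exp(-2) ~ 0.135
```

The gaussian still has support at z=0.5, two sigma from the first bin's mean, while the tophat cuts off abruptly; sharp edges in W(z) translate into more oscillatory structure in the angular spectra.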

[plot: class_dens1-dens2_output]

hsxavier commented 9 years ago

I tried some debugging and I may have a clue as to when this happens:

The transfer functions \Delta_l(k) (tr.transfer[][]), used to compute Cls from the primordial P(k), are highly oscillatory in k for low ell and become something close to a top-hat in k for l>=l_a, probably as an approximation.

For cross Cls like dens[1]-dens[2], two different transfer functions are used, each with a different l_a. Call these two l_a's l_a1 and l_a2 and assume that l_a1 < l_a2. The weird points in the cross Cls appear for ells in the range l_a1 <= ell < l_a2, i.e. for ells at which the approximation is used in only one of the two transfer functions. It seems this would go away if the approximation were turned on at the same ell for both transfer functions.
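The mismatch window described above can be sketched numerically, assuming the approximation switches on at roughly l_a = 30 * z_mean for the default precision (an assumption about the switching rule, inferred from the name of the precision parameter l_switch_limber_for_cl_density_over_z; the actual criterion is in the CLASS source):

```python
def limber_switch_ell(z_mean, l_switch_over_z=30.0):
    """Multipole above which the Limber-type approximation is assumed to be
    used for a density window centred at z_mean.  The rule
    l >= l_switch_over_z * z_mean is an assumption for illustration."""
    return int(l_switch_over_z * z_mean)

l_a1 = limber_switch_ell(0.4)  # first bin
l_a2 = limber_switch_ell(0.5)  # second bin

# Ells where only one of the two transfer functions uses the approximation:
mismatch = list(range(min(l_a1, l_a2), max(l_a1, l_a2)))
print(l_a1, l_a2, mismatch)  # 12 15 [12, 13, 14]
```

Under this assumed rule the mismatch for the (0.4, 0.5) bin pair sits at ell ~ 12-14, consistent with a sharp feature at low ell that moves to higher ell for higher-redshift bin pairs.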

montanari commented 9 years ago

Hello,

by default the Limber approximation is used according to the following precision parameter:

ppr->l_switch_limber_for_cl_density_over_z=30.;

As you noticed, this may introduce large errors in the angular spectra, especially for bin cross-correlations. To avoid it, the above parameter can be increased to a large number. Of course, other precision parameters (optimized by default for the CMB) may also need to be adapted to keep LSS computations fast. You can add the following lines to the *.ini you are using:

# For density Cl, we recommend not to use the Limber approximation
# at all, and hence to put here a very large number (e.g. 10000); but
# if you have wide and smooth selection functions you may wish to
# use it; then 30 might be OK
l_switch_limber_for_cl_density_over_z=10000.

# Reduce the bessel sampling to obtain faster results that require less memory
selection_sampling_bessel=.5

# For large redshift bins, larger q_linstep values are still accurate (increases speed considerably)
q_linstep=1.

# Increase for higher accuracy especially at high l's
k_max_tau0_over_l_max= 2.5

This will give you a fast, smooth output, stable if precision is pushed further. For different configurations (e.g. larger/narrower window functions) you may need to adapt these parameters to get the optimal speed and accuracy. The precision parameters are described in include/common.h, and the default values are in source/input.c.

Cheers, Francesco

hsxavier commented 8 years ago

Hello, Francesco.

Thank you for the input.

I've been trying various precision parameters and it seems that for my goals (density and convergence Cls from z=0.3 to z=2.0 in tophat bins with half-width 0.02, up to l=3000, with errors <1%) the precision has to be pushed quite high, and the computation may take days to finish. Maybe I am doing something wrong... The parameters I've been trying out that most affect the precision (and speed) seem to be:

l_logstep=1.026                               # transfer
l_linstep=25                                  # transfer
hyper_sampling_flat = 12.                     # transfer
hyper_phi_min_abs = 1.e-10                    # transfer
q_logstep_spline= 20.                         # transfer
l_switch_limber_for_cl_density_over_z=10000.  # transfer
selection_sampling_bessel = 10                # transfer
q_linstep = 1.0                               # transfer
k_max_tau0_over_l_max=20                      # perturbations

I've used only l_switch_limber_for_cl_density_over_z=10000. Do you have any suggestions on which settings I should use?

Thanks, Henrique


hsxavier commented 8 years ago

Hello,

I decided to use the following precision parameters, which I tested to lead to the required precision:

recfast_Nz0=100000
tol_thermo_integration=1.e-5
recfast_x_He0_trigger_delta = 0.01
recfast_x_H0_trigger_delta = 0.01
radiation_streaming_trigger_tau_c_over_tau = 100.
evolver=0
tight_coupling_trigger_tau_c_over_tau_h=0.005
tight_coupling_trigger_tau_c_over_tau_k=0.008
radiation_streaming_approximation = 2
radiation_streaming_trigger_tau_over_tau_k = 240.
ur_fluid_approximation = 2
ur_fluid_trigger_tau_over_tau_k = 50.
k_min_tau0=0.002
k_step_sub=0.015
k_step_super=0.0001
k_step_super_reduction=0.1
start_small_k_at_tau_c_over_tau_h = 0.0004
start_large_k_at_tau_h_over_tau_k = 0.05
l_max_g=50
l_max_pol_g=25
l_max_ur=50
tol_perturb_integration=1.e-6
perturb_sampling_stepsize=0.01
l_logstep=1.08
l_switch_limber_for_cl_density_over_z=10000.
selection_sampling_bessel = 2.5
q_linstep = 1.0
k_max_tau0_over_l_max=20

However, with these settings CLASS has problems generating nCl depending on the number of redshift bins: with 8 bins it seems to work fine (red line in the plot below), with 20 it returns a weird result (green line in the plot below), and with 40 it crashes with a memory error: *** glibc detected *** ./class: munmap_chunk(): invalid pointer: 0x00002ad5c4565010 ***

[plot: class_density_cls_20x8_bins]

montanari commented 8 years ago

Hi Henrique,

When including many relatively narrow redshift bins, I usually rely on a cluster to obtain precise Cl's.

About the strange behaviour above: I tried to run 20 bins between z=0.3 and z=2.0 using your values

l_max_lss =  3000
l_switch_limber_for_cl_density_over_z=10000.
selection_sampling_bessel = 2.5
q_linstep = 1.0
k_max_tau0_over_l_max=20

and leaving the others at their default values. In this case the Cl's (density+convergence) are smooth and do not show the strange behaviour (which was hence due to poor precision in some of the other parameters set by hand).

I was not able to reproduce the error

*** glibc detected *** ./class: munmap_chunk(): invalid pointer: 0x00002ad5c4565010 ***

hence I would not know how to comment on it further. It may be related to some specific option (or modification of the source code?) in your ini file that I am not including, which causes a memory allocation issue. Valgrind should help find the origin of the error. I guess that running it with few bins and low precision should be enough to spot the origin of the problem.

Cheers, Francesco

hsxavier commented 8 years ago

Hello, Francesco.

With respect to the strange behaviour with 20 bins, it seems that changing l_logstep creates the problem. Indeed, the problem vanishes if this parameter is left at its default value.

The suggestion of breaking down the computation of Cls for many bins into many pairs of bins and using a cluster was really good! I realised that the number of Cls to be computed would increase almost 3x, but the work can be distributed over the cluster, so in the end this approach is faster.
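A quick count shows where the "almost 3x" comes from (a sketch; the run layout assumed here, one CLASS run per pair of bins with nondiagonal = 1, is my reading of the approach, not something prescribed by CLASS):

```python
from itertools import combinations

n_bins = 40

# One single run with all bins and full nondiagonal output:
# n auto-spectra plus n*(n-1)/2 cross-spectra.
cls_single_run = n_bins + n_bins * (n_bins - 1) // 2

# One run per pair of bins: each run outputs 2 auto-spectra and
# 1 cross-spectrum, so the autos are recomputed in every pair.
pair_runs = list(combinations(range(n_bins), 2))
cls_pairwise = 3 * len(pair_runs)

print(cls_single_run, cls_pairwise)  # 820 2340
print(cls_pairwise / cls_single_run)  # about 2.85, i.e. "almost 3x"
```

The total work nearly triples, but each pairwise run is small and independent, so on a cluster the wall-clock time drops.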

This also allowed me to narrow down the source of the problems I am having when computing the Cls. Out of 40 top-hat bins from z=0.035 to z=2 with half-width 0.025, I only got (and always got) segmentation faults when one bin is at z=0.035 (an example .ini file is attached). Is this simple to fix? I tried using valgrind but it was taking quite a while to run.

Cheers, Henrique


montanari commented 8 years ago

Hi Henrique,

I was not able to reproduce the segmentation fault with my compiler GCC 4.8.4, and the results seem reasonable.

However, when running with the following parameters in the ini (similar to the one you passed me, but with reduced precision, since here I am not interested in the result itself) with class v2.4.4:

output = nCl
number count contributions = density
l_max_lss = 500
selection = tophat
selection_mean = 0.035
selection_width =  0.025
non_diagonal = 0
root = output/break-1-20_
headers = yes
format = class
write parameters = no

l_switch_limber_for_cl_density_over_z=10000.       # transfer
selection_sampling_bessel = 2.                    # transfer
q_linstep = 100.0                                    # transfer
k_max_tau0_over_l_max=2.                           # perturbations

non linear = halofit

I found a possible memory leak associated with libgomp:

$valgrind --tool=memcheck --leak-check=yes --leak-check=full --show-leak-kinds=all ./class break-1-20.ini
==5167== Memcheck, a memory error detector
==5167== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==5167== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==5167== Command: ./class break-1-20.ini
==5167== 
==5167== 
==5167== HEAP SUMMARY:
==5167==     in use at exit: 2,888 bytes in 6 blocks
==5167==   total heap usage: 165,520 allocs, 165,514 frees, 912,056,006 bytes allocated
==5167== 
==5167== 40 bytes in 1 blocks are still reachable in loss record 1 of 4
==5167==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5167==    by 0x4C2CF1F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5167==    by 0x51417F8: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==5167==    by 0x51459FA: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==5167==    by 0x4650CD: perturb_init (perturbations.c:264)
==5167==    by 0x401AAE: main (class.c:36)
==5167== 
==5167== 192 bytes in 1 blocks are still reachable in loss record 2 of 4
==5167==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5167==    by 0x51417A8: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==5167==    by 0x5145B8E: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==5167==    by 0x4650CD: perturb_init (perturbations.c:264)
==5167==    by 0x401AAE: main (class.c:36)
==5167== 
==5167== 864 bytes in 3 blocks are possibly lost in loss record 3 of 4
==5167==    at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5167==    by 0x4012E54: allocate_dtv (dl-tls.c:296)
==5167==    by 0x4012E54: _dl_allocate_tls (dl-tls.c:460)
==5167==    by 0x5354DA0: allocate_stack (allocatestack.c:589)
==5167==    by 0x5354DA0: pthread_create@@GLIBC_2.2.5 (pthread_create.c:500)
==5167==    by 0x5145905: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==5167==    by 0x4650CD: perturb_init (perturbations.c:264)
==5167==    by 0x401AAE: main (class.c:36)
==5167== 
==5167== 1,792 bytes in 1 blocks are still reachable in loss record 4 of 4
==5167==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5167==    by 0x51417A8: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==5167==    by 0x51454D5: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==5167==    by 0x514447B: GOMP_parallel_start (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==5167==    by 0x48FDDE: spectra_cls (spectra.c:1878)
==5167==    by 0x492DCB: spectra_init (spectra.c:1247)
==5167==    by 0x401B32: main (class.c:56)
==5167== 
==5167== LEAK SUMMARY:
==5167==    definitely lost: 0 bytes in 0 blocks
==5167==    indirectly lost: 0 bytes in 0 blocks
==5167==      possibly lost: 864 bytes in 3 blocks
==5167==    still reachable: 2,024 bytes in 3 blocks
==5167==         suppressed: 0 bytes in 0 blocks
==5167== 
==5167== For counts of detected and suppressed errors, rerun with: -v
==5167== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

It is a dubious leak, but since you get a segmentation fault for similar configurations (i.e. when computing a redshift bin at z<0.05), it may be relevant.

Cheers, Francesco