brinckmann / montepython_public

Public repository for the Monte Python Code
MIT License

Going around multimodal #30

Closed: lanny222 closed this issue 1 year ago

lanny222 commented 5 years ago

Hi, I'm doing a few runs for a few different f(T) models using Planck 2015 + JLA + WiggleZ BAO, but a few of my main parameters seem to be producing non-Gaussian posteriors, such as the ones in the attached plots.

Are these simply intrinsic properties of the model (which I'm assuming they aren't, since other similar published papers show they should be Gaussian), or a problem with my .param file, or something that will smooth out as more and more points are accumulated per chain?

brinckmann commented 5 years ago

Hi,

Likely your chains are just not properly converged. You want an R-1 smaller than about 0.01. However, note that the accuracy of the Gelman-Rubin convergence criterion depends on the number of chains, so if you have a very low number of chains it will be inaccurate and you may not be able to rely on conventional wisdom for when to consider your chains converged. I usually recommend at least 4 chains, and I personally try to use 8 chains if that's feasible on the system I'm running on.
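(Aside: as a rough illustration of the diagnostic being discussed, the sketch below computes a textbook Gelman-Rubin style R-1 for a single parameter from several chains loaded as NumPy arrays. The function name and setup are made up for this example; MontePython's own analysis module implements the diagnostic somewhat differently.)

import numpy as np

def r_minus_one(chains):
    """Textbook Gelman-Rubin style R-1 for one parameter.

    chains: list of 1-D NumPy arrays, one per chain, burn-in already removed.
    """
    n = min(len(c) for c in chains)           # common chain length
    x = np.array([c[:n] for c in chains])     # shape (number of chains, n)
    within = x.var(axis=1, ddof=1).mean()     # mean within-chain variance W
    between = n * x.mean(axis=1).var(ddof=1)  # between-chain variance B
    var_est = (n - 1) / n * within + between / n
    return np.sqrt(var_est / within) - 1.0    # well-converged chains give values << 1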

For multimodal distributions it may be beneficial to instead use the MultiNest sampler with the flag -m ns, although in this particular case that doesn't seem necessary, as the modes are very close in parameter space (assuming your exploration did not miss any other modes!).
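(For reference, a MultiNest run would be launched with something like the command below; the parameter file and output folder names are placeholders, and MultiNest/PyMultiNest need to be installed for the ns method to be available.)

python montepython/MontePython.py run -p your_model.param -o chains/your_model_NS -m ns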

Best, Thejs

lanny222 commented 5 years ago

I'm running on a cluster with 20 chains per run, and all R-1 values are well below 0.01. Maybe this is too many chains? But I thought the more chains the better. @brinckmann

lanny222 commented 5 years ago

--> Scanning file ../chains/test/2018-12-10_2000000__1.txt : Removed 6950 points of burn-in, and first 25 percent, keep 51519 steps
    2018-12-10_2000000__2.txt : Removed 506 points of burn-in, and first 25 percent, keep 57105 steps
    2018-12-10_2000000__3.txt : Removed 3549 points of burn-in, and first 25 percent, keep 54877 steps
    2018-12-10_2000000__4.txt : Removed 1322 points of burn-in, and first 25 percent, keep 56101 steps
    2018-12-10_2000000__5.txt : Removed 4137 points of burn-in, and first 25 percent, keep 54363 steps
    2018-12-10_2000000__6.txt : Removed 671 points of burn-in, and first 25 percent, keep 56952 steps
    2018-12-10_2000000__7.txt : Removed 2013 points of burn-in, and first 25 percent, keep 56238 steps
    2018-12-10_2000000__8.txt : Removed 2912 points of burn-in, and first 25 percent, keep 54954 steps
    2018-12-10_2000000__9.txt : Removed 804 points of burn-in, and first 25 percent, keep 56877 steps
    2018-12-10_2000000__10.txt : Removed 2196 points of burn-in, and first 25 percent, keep 55701 steps
    2018-12-10_2000000__11.txt : Removed 5331 points of burn-in, and first 25 percent, keep 53079 steps
    2018-12-10_2000000__12.txt : Removed 4204 points of burn-in, and first 25 percent, keep 54061 steps
    2018-12-10_2000000__13.txt : Removed 1342 points of burn-in, and first 25 percent, keep 56052 steps
    2018-12-10_2000000__14.txt : Removed 3501 points of burn-in, and first 25 percent, keep 54112 steps
    2018-12-10_2000000__15.txt : Removed 1217 points of burn-in, and first 25 percent, keep 55563 steps
    2018-12-10_2000000__16.txt : Removed 1136 points of burn-in, and first 25 percent, keep 55549 steps
    2018-12-10_2000000__17.txt : Removed 1890 points of burn-in, and first 25 percent, keep 55180 steps
    2018-12-10_2000000__18.txt : Removed 1529 points of burn-in, and first 25 percent, keep 55463 steps
    2018-12-10_2000000__19.txt : Removed 2471 points of burn-in, and first 25 percent, keep 54439 steps
    2018-12-10_2000000__20.txt : Removed 2625 points of burn-in, and first 25 percent, keep 54312 steps
--> Computing mean values
--> Computing variance
--> Computing convergence criterium (Gelman-Rubin)
    -> R-1 is 0.005001 for omega_cdm
              0.005835 for b
              0.000677 for A_cib_217
              0.000661 for xi_sz_cib
              0.000475 for A_sz
              0.000516 for ps_A_100_100
              0.000381 for ps_A_143_143
              0.000633 for ps_A_143_217
              0.000643 for ps_A_217_217
              0.000563 for ksz_norm
              0.000251 for gal545_A_100
              0.000345 for gal545_A_143
              0.000673 for gal545_A_143_217
              0.000484 for gal545_A_217
              0.000787 for calib_100T
              0.000397 for calib_217T
              0.002873 for A_planck
              0.000723 for alpha
              0.000748 for M
              0.000564 for beta
              0.000441 for Delta_M
              0.005001 for Omega_m
              0.005001 for H0
              0.004901 for sigma8

brinckmann commented 5 years ago

Can you show me your parameter file, please? It looks like you only have one varying cosmological parameter and everything else fixed, is that correct?

lanny222 commented 5 years ago

@brinckmann

data.experiments=['Planck_highl','Planck_lowl','lowlike','JLA','WiggleZ_bao']
data.over_sampling=[1, 4, 3]

data.parameters['omega_cdm'] = [0.11919, None, None, 0.0027, 1, 'cosmo']
data.parameters['b'] = [0.25, None, None, 0.1, 1, 'cosmo']

data.parameters['A_cib_217'] = [ 61, 0, 200, 7, 1, 'nuisance']
data.parameters['cib_index'] = [ -1.3, -1.3, -1.3, 0, 1, 'nuisance']
data.parameters['xi_sz_cib'] = [ 0.13, 0, 1, 0.3, 1, 'nuisance']
data.parameters['A_sz'] = [ 6.86, 0, 10, 1.9, 1, 'nuisance']
data.parameters['ps_A_100_100'] = [ 222.9, 0, 400, 30, 1, 'nuisance']
data.parameters['ps_A_143_143'] = [ 38, 0, 400, 8, 1, 'nuisance']
data.parameters['ps_A_143_217'] = [ 35.2, 0, 400, 10, 1, 'nuisance']
data.parameters['ps_A_217_217'] = [ 102.6, 0, 400, 11, 1, 'nuisance']
data.parameters['ksz_norm'] = [ 0, 0, 10, 4.2, 1, 'nuisance']
data.parameters['gal545_A_100'] = [ 6.75, 0, 50, 1.8, 1, 'nuisance']
data.parameters['gal545_A_143'] = [ 9.41, 0, 50, 1.8, 1, 'nuisance']
data.parameters['gal545_A_143_217'] = [ 19.28, 0, 100, 4.2, 1, 'nuisance']
data.parameters['gal545_A_217'] = [ 81.7, 0, 400, 7.9, 1, 'nuisance']
data.parameters['calib_100T'] = [ 998.59, 0, 3000, 0.73, 0.001, 'nuisance']
data.parameters['calib_217T'] = [ 995.89, 0, 3000, 1.4, 0.001, 'nuisance']
data.parameters['A_planck'] = [100.028, 90, 110, 0.25, 0.01, 'nuisance']
data.parameters['alpha'] = [0.15, None, None, 0.001, 1, 'nuisance']
data.parameters['M'] = [-19.02, None, None, 0.004, 1, 'nuisance']
data.parameters['beta'] = [3.559, None, None, 0.02, 1, 'nuisance']
data.parameters['Delta_M'] = [-0.10, None, None, 0.004, 1, 'nuisance']

data.parameters['Omega_m'] = [0.3, 0, 0.6, 0, 1, 'derived']
data.parameters['H0'] = [0, None, None, 0, 1, 'derived']
data.parameters['sigma8'] = [0.8, None, None, 0, 1, 'derived']

data.cosmo_arguments['sBBN file'] = data.path['cosmo']+'/bbn/sBBN.dat'
data.cosmo_arguments['k_pivot'] = 0.05

data.cosmo_arguments['N_ur'] = 2.0328
data.cosmo_arguments['N_ncdm'] = 1
data.cosmo_arguments['m_ncdm'] = 0.06
data.cosmo_arguments['T_ncdm'] = 0.71611
data.cosmo_arguments['Omega_b'] = 0.05
data.cosmo_arguments['T_cmb'] = 2.726

data.cosmo_arguments['output'] = 'mPk'
data.cosmo_arguments['P_k_max_h/Mpc'] = 1.

data.N=10
data.write_step=5
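(Aside for readers: each data.parameters entry above follows MontePython's [mean value, minimum, maximum, 1-sigma proposal width, scale factor, role] convention, so a line like the following, copied from the file above, declares 'b' as an unbounded varying cosmological parameter with a proposal width of 0.1.)

data.parameters['b'] = [0.25, None, None, 0.1, 1, 'cosmo']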

brinckmann commented 5 years ago

You have only two cosmological parameters alongside a lot of nuisance parameters and oversampling, so your chains will accumulate a lot of points quickly but won't necessarily sample your cosmological parameters very well. For future runs like this I would probably decrease or disable the oversampling, but for now you can simply try to accumulate more points.
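(If you decide to disable the oversampling in a follow-up run, a minimal change to the parameter file above would be to set every over-sampling entry to 1, for example:)

data.over_sampling=[1, 1, 1]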

Another possible issue could be a less than optimal covmat. The easiest way to check this is to copy the covmat that is being used (i.e. <output_directory_name>.covmat if the covmat was updated during the run, otherwise the input covmat) to a file in the same directory named inv_fisher.mat and pass the --plot-fisher flag when analyzing (you may also need the --center-fisher flag to center the ellipse on the best fit).

If the covmat is neatly aligned with the axes of your posterior you are probably fine (although a similar width and length is also preferable). If you suspect your covmat is not good for properly exploring your parameter space, you can always force the code to compute a new one with --want-covmat when analyzing (even if the run is ongoing: as long as the update flag is enabled, which it is by default, the chains will automatically load the newly computed covmat). Note that technically this means the part of the chains accumulated before the new covmat was computed is not Markovian and should not be included in a final analysis, but you may find this does not significantly impact your results.
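(A rough sketch of the covmat check and recomputation described above might look like the commands below, with chains/test standing in for the actual output directory and covmat name; the paths are placeholders, but the flags are the ones mentioned in this thread.)

cp chains/test/test.covmat chains/test/inv_fisher.mat
python montepython/MontePython.py info chains/test/ --plot-fisher --center-fisher
python montepython/MontePython.py info chains/test/ --want-covmat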

brinckmann commented 5 years ago

It's a bit hard to say. I would probably start a new run in a new folder, starting from a covmat computed from these chains and with no oversampling. That said, it looks like it could be a real effect rather than a convergence issue, so you may want to look into using MultiNest instead, to make sure it's properly sampled.

lanny222 commented 5 years ago

@brinckmann Will do, I'll try restarting from the currently computed covmat and let it run; if that doesn't work I'll try using MultiNest. Thank you for the help.