Amlan1996 commented 1 year ago

Hi,

After installing MontePython on my HPC cluster account, I tried to test it by running the file 'base2018TTTEEE.param' with MPI. It runs, but it produces only 2 or 3 chains instead of 16 (the number of chains I requested in my job file). The chain files also stay empty the whole time; no output values are written to them.

THE ERROR FILE SHOWS THE FOLLOWING:

Removing mpi version 2021.6.0
Use module list to view any remaining dependent modules.
Removing mkl version 2022.1.0
Use module list to view any remaining dependent modules.
Removing compiler version 2022.1.0
Use module list to view any remaining dependent modules.
Removing tbb version 2021.6.0
Use module list to view any remaining dependent modules.
Removing compiler-rt version 2022.1.0
Use module list to view any remaining dependent modules.
Removing oclfpga version 2022.1.0
Use module list to view any remaining dependent modules.
Traceback (most recent call last):
  File "/home/amlan/monte3/montepython/run.py", line 191, in safe_initialisation
    cosmo, data, command_line, success = initialise(custom_command)
  File "/home/amlan/monte3/montepython/initialise.py", line 67, in initialise
    data = Data(command_line, path)
  File "/home/amlan/monte3/montepython/data.py", line 361, in __init__
    self.initialise_likelihoods(self.experiments)
  File "/home/amlan/monte3/montepython/data.py", line 498, in initialise_likelihoods
    elem, elem, folder, elem))
  File "<string>", line 1, in <module>
  File "/home/amlan/monte3/montepython/likelihood_class.py", line 863, in __init__
    Likelihood.__init__(self, path, data, command_line)
  File "/home/amlan/monte3/montepython/likelihood_class.py", line 70, in __init__
    self.read_from_file(path, data, command_line)
  File "/home/amlan/monte3/montepython/likelihood_class.py", line 182, in read_from_file
    "Be sure there is noting in it before doing this !")
io_mp.ConfigurationError:

Configuration Error:
/|\ No information on Planck_highl_TTTEEE likelihood was found in the
/o\ example_out/log.param file. This can result from a failed
    initialization of a previous run. To solve this, you can do a
    ]$ rm -rf example_out
    Be sure there is noting in it before doing this !

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "montepython/MontePython.py", line 40, in <module>
    sys.exit(run())
  File "/home/amlan/monte3/montepython/run.py", line 32, in run
    custom_command)
  File "/home/amlan/monte3/montepython/run.py", line 198, in safe_initialisation
    "The initialisation was not successful, resulting in a "
io_mp.ConfigurationError:

Configuration Error:
/|\ The initialisation was not successful, resulting in a potentially half
/o\ created log.param. Please see the above error message. If you run the
    exact same command, it will not work. You should solve the problem, and
    try again

ALSO, THE OUTPUT FILE SHOWS THE SAME KIND OF ERROR:

Running Monte Python v3.5.0

with CLASS v3.2.0

Testing likelihoods for: ->Planck_highl_TTTEEE, Planck_lowl_EE, Planck_lowl_TT

Configuration Error:
/|\ No information on Planck_highl_TTTEEE likelihood was found in the
/o\ example_out/log.param file. This can result from a failed
    initialization of a previous run. To solve this, you can do a
    ]$ rm -rf example_out
    Be sure there is noting in it before doing this !

So it would be great if anyone could help me resolve this problem.

Thanks in advance.

Best regards, Amlan

brinckmann commented 1 year ago

Hi Amlan,

I'm a bit confused by the error. Did you specify the path to the Planck module in default.conf or the file you input? You also need to source the relevant shell script. E.g. on zsh I have in my .zshrc (for bash it would be .bashrc and clik_profile.sh):

source ~/software/Planck18/code/plc_3.0/plc-3.01/bin/clik_profile.zsh

and in my default.conf I have

root = '~/software'
path['clik'] = root+'/Planck18/code/plc_3.0/plc-3.01/'

Best, Thejs

Amlan1996 commented 1 year ago

Hi Thejs,

Yes, I have specified all the paths you mentioned, and my .bashrc sources the clik profile as you described. Here is my .bashrc file:

# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

# User specific environment
if ! [[ "$PATH" =~ "$HOME/.local/bin:$HOME/bin:" ]]
then
    PATH="$HOME/.local/bin:$HOME/bin:$PATH"
fi
export PATH

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions
module load python/3.6.3
module load ohpc
module load intel/2019
module load cfitsio/3.49
source /home/amlan/planck/code/plc_3.0/plc-3.01/bin/clik_profile.sh

Here is the default.conf file :

# Move this file to default.conf, and adapt it to your needs
#
# Fill in the relevant path to your personal distribution.
# If you create a new file out of this one, please remember to call
# MontePython.py with the option '--conf my.conf'
#
# At minimum, this file should contain one line:
# ** path['cosmo'] = path/to/your/class
# Note, if you are using a modified version of class, be sure that the
# path contains the word class, otherwise the code might not recognise
# it.
#
# If you want to use Planck likelihood, you should specify the
# following line:
# ** path['clik'] = /path/to/plc/
# which correspond to the folder with the src/ folder, setup.py, etc.
#
# If you want to use a data folder different from the one present in the folder
# you are executing the code, please also add:
# ** path['data'] = /path/to/the/other/data/

root = '/home/amlan'

path['cosmo'] = root+'/class'
path['clik'] = root+'/planck/code/plc_3.0/plc-3.01/'

brinckmann commented 1 year ago

What happens if you open python and write import clik ?

Best, Thejs

Amlan1996 commented 1 year ago

[amlan@nova ~]$ python
Python 3.6.3 |Anaconda, Inc.| (default, Oct 13 2017, 12:02:49)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import clik

It's not showing any error.

brinckmann commented 1 year ago

Okay so it appears clik is installed correctly, which is good, but also makes it more confusing.

I suspect the files/directory isn't being created correctly (like you point out, only a few files get created).

Make sure to run in a clean directory, i.e. create a new directory specifically for this run, one that you haven't tried to run in previously. There's a chance the log.param from a previous run wasn't created properly.

It is also good practice when running with mpi to make sure the log.param is created before a run, e.g.

python montepython/MontePython.py run -p input/base2018TTTEEE.param -o chains/Planck_test -f 0

which will create the directory and the log.param and do only one step without making a jump before stopping. This avoids some issues relating to mpi runs, especially on clusters. Then you can do an mpi run, something like (or however you normally launch with mpi):

mpirun -np 4 python montepython/MontePython.py run -p input/base2018TTTEEE.param -o chains/Planck_test -N 100000
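As a rough illustration of the "create log.param first" advice, the check below could be run between the single-process step and the mpirun step. It is only a sketch: the helper name `log_param_mentions` is hypothetical, and it assumes that log.param stores likelihood settings as lines of the form `<likelihood>.<option> = <value>`, which may not match every MontePython version.

```python
from pathlib import Path


def log_param_mentions(output_dir, likelihood):
    """Return True if output_dir/log.param has at least one line for `likelihood`.

    Assumption (not taken from the thread): likelihood settings appear in
    log.param as lines starting with "<likelihood>.", for example
    "Planck_highl_TTTEEE.some_option = ...".
    """
    log_param = Path(output_dir) / "log.param"
    if not log_param.is_file():
        # No log.param at all: the initial single-process run has not happened.
        return False
    prefix = likelihood + "."
    with log_param.open() as handle:
        # A half-written log.param would typically lack these lines.
        return any(line.lstrip().startswith(prefix) for line in handle)
```

For example, `log_param_mentions("chains/Planck_test", "Planck_highl_TTTEEE")` returning False would suggest removing the folder and repeating the `-f 0` run before launching with mpi.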

One more thing to take care of is if you're running on a cluster you may need to source the planck installation in your job script as well, in case it doesn't inherit that information with your job submission.
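One way to see whether a batch job actually inherits the Planck setup is to print a small report from inside the job before starting the chains. The sketch below is hypothetical: the function name is made up, and it assumes that sourcing clik_profile.sh exports a variable called CLIK_PATH, so adjust the name if your plc installation differs.

```python
import importlib.util
import os


def clik_environment_report(env=None):
    """Summarise whether the Planck likelihood looks usable in this process.

    Assumption: clik_profile.sh exports CLIK_PATH. If that variable is unset
    inside the job, the profile was probably not sourced there.
    """
    env = os.environ if env is None else env
    return {
        # find_spec returns None when the top-level module cannot be located.
        "clik_importable": importlib.util.find_spec("clik") is not None,
        "CLIK_PATH": env.get("CLIK_PATH"),
    }
```

Calling `print(clik_environment_report())` at the top of the job and comparing the result with the same call on the login node shows quickly whether the two environments differ.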

Best, Thejs

Amlan1996 commented 1 year ago

I tried it the way you suggested, and now it works correctly and no longer shows any error about the likelihood. It also produces all the chains. However, it now shows an error coming from CLASS, even though I didn't edit anything in CLASS.

Here is what the error looks like:

Error in Class: background_init(L:819) :condition (pba->shooting_failed == TRUE) is true; Shooting failed, try optimising input_get_guess(). Error message:

input_shooting(L:646) :error in input_find_root(&xzero, &fevals, ppr->tol_shooting_deltax_rel, &fzw, errmsg); =>input_find_root(L:963) :error in input_fzero_ridder(input_fzerofun_1d, x1, x2, tol_x_rel*(((fabs(x1))<(fabs(x2))) ? (fabs(x2)) : (fabs(x1)) ), pfzw, &f1, &f2, xzero, fevals, errmsg); =>input_fzero_ridder(L:1118) :error; root must be bracketed in zriddr.

Thanks for the help in creating the chains in montepython.

Best regards, Amlan

Amlan1996 commented 1 year ago

After 18 hrs of running the code, these are the errors I have got till now:

Error in Class: background_init(L:819) :condition (pba->shooting_failed == TRUE) is true; Shooting failed, try optimising input_get_guess(). Error message:

input_shooting(L:646) :error in input_find_root(&xzero, &fevals, ppr->tol_shooting_deltax_rel, &fzw, errmsg); =>input_find_root(L:963) :error in input_fzero_ridder(input_fzerofun_1d, x1, x2, tol_x_rel*(((fabs(x1))<(fabs(x2))) ? (fabs(x2)) : (fabs(x1)) ), pfzw, &f1, &f2, xzero, fevals, errmsg); =>input_fzero_ridder(L:1118) :error; root must be bracketed in zriddr.

Error in Class: background_init(L:819) :condition (pba->shooting_failed == TRUE) is true; Shooting failed, try optimising input_get_guess(). Error message:

input_shooting(L:646) :error in input_find_root(&xzero, &fevals, ppr->tol_shooting_deltax_rel, &fzw, errmsg); =>input_find_root(L:963) :error in input_fzero_ridder(input_fzerofun_1d, x1, x2, tol_x_rel*(((fabs(x1))<(fabs(x2))) ? (fabs(x2)) : (fabs(x1)) ), pfzw, &f1, &f2, xzero, fevals, errmsg); =>input_fzero_ridder(L:1118) :error; root must be bracketed in zriddr.

Error in Class: perturbations_init(L:1006) :error in perturbations_solve(ppr, pba, pth, ppt, index_md, index_ic, index_k, pppw[thread]); =>perturbations_solve(L:3221) :error in perturbations_find_approximation_switches(ppr, pba, pth, ppt, index_md, k, ppw, tau, ppt->tau_sampling[tau_actual_size-1], ppr->tol_tau_approx, interval_number, interval_number_of, interval_limit, interval_approx); =>perturbations_find_approximation_switches(L:3689) :error in perturbations_approximations(ppr, pba, pth, ppt, index_md, k, mid, ppw); =>perturbations_approximations(L:6190) :condition (tau_c < 0.) is true; tau_c = 1/kappa' should always be positive unless there is something wrong in the thermodynamics module. However you have here tau_c=-9.136055e+08 at z=5.035113e-02, conformal time=1.395659e+04 x_e=-2.875294e-03. (This could come from the interpolation of a too poorly sampled reionisation history?).

Error in Class: perturbations_init(L:1006) :error in perturbations_solve(ppr, pba, pth, ppt, index_md, index_ic, index_k, pppw[thread]); =>perturbations_solve(L:3296) :error in perturbations_vector_init(ppr, pba, pth, ppt, index_md, index_ic, k, interval_limit[index_interval], ppw, previous_approx); =>perturbations_vector_init(L:4386) :condition (ppw->approx[ppw->index_ap_tca] == (int)tca_off) is true; scalar initial conditions assume tight-coupling approximation turned on

Error in Class: background_init(L:819) :condition (pba->shooting_failed == TRUE) is true; Shooting failed, try optimising input_get_guess(). Error message:

input_shooting(L:646) :error in input_find_root(&xzero, &fevals, ppr->tol_shooting_deltax_rel, &fzw, errmsg); =>input_find_root(L:963) :error in input_fzero_ridder(input_fzerofun_1d, x1, x2, tol_x_rel*(((fabs(x1))<(fabs(x2))) ? (fabs(x2)) : (fabs(x1)) ), pfzw, &f1, &f2, xzero, fevals, errmsg); =>input_fzero_ridder(L:1118) :error; root must be bracketed in zriddr.

I am trying to understand why I am getting errors in CLASS, because I have not modified anything in CLASS, and these errors don't show up when I run CLASS on its own. They also don't show up immediately after I run the command:

python montepython/MontePython.py run -p input/base2018TTTEEE.param -o chains/Planck_test -f 0

They only show up after the run has been going for a long time (roughly 16 to 18 hrs). It would be great if you could help with this.

brinckmann commented 1 year ago

Sometimes the shooting in class returns these errors when sampling tau_reio, as there's some instability in the shooting method, although I'm a little surprised it's happening for Planck-only. You can try to sample z_reio instead (maybe putting some bounds on the parameter [2nd and 3rd numbers], e.g. 3 and 15, roughly corresponding to 5 sigma for Planck, and making sure to change the proposal width [4th number]) and include tau_reio as a derived parameter, since going from z_reio to tau_reio is easier than vice versa:

data.parameters['z_reio'] = [ 7.0, 3.0, 15.0, 1.0, 1, 'cosmo']

This parameter now has to be grouped with the other cosmo parameters.

data.parameters['tau_reio'] = [1, None, None, 0, 1, 'derived']

This parameter now has to be grouped with the other derived parameters.
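To make the positions in those lists easier to follow, here is a small self-contained sketch that labels each entry. The `data` object is a hypothetical stand-in for the one MontePython provides inside a .param file, and the field names in `FIELDS` are illustrative labels rather than names used by MontePython itself.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the `data` object available inside a .param file.
data = SimpleNamespace(parameters={})

# Illustrative labels for the six entries of each parameter list.
FIELDS = ("start", "lower_bound", "upper_bound", "proposal_width", "scale", "role")

# Values taken from the suggestion above.
data.parameters["z_reio"] = [7.0, 3.0, 15.0, 1.0, 1, "cosmo"]
data.parameters["tau_reio"] = [1, None, None, 0, 1, "derived"]


def describe(name):
    """Pair each entry of a parameter list with its label."""
    return dict(zip(FIELDS, data.parameters[name]))


def varying_parameters():
    """Names that are sampled: not derived and with a non-zero proposal width."""
    return [name for name, entry in data.parameters.items()
            if entry[5] != "derived" and entry[3] != 0]
```

With these two entries, `describe("z_reio")` shows the bounds 3.0 and 15.0 in the 2nd and 3rd positions, and `varying_parameters()` lists only z_reio, since tau_reio is now computed as a derived quantity.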

Hopefully the other errors also go away with this change.

Best, Thejs

Amlan1996 commented 1 year ago

Ok, I will check and let you know.

Best, Amlan

Amlan1996 commented 1 year ago

Hi Thejs,

It worked. The code ran successfully, and I got no errors.

Thanks a lot for your continued help.

Best, Amlan

brinckmann / montepython_public

Configuration Error: No information on Planck_highl_TTTEEE likelihood was found #309
