JohannesBuchner / UltraNest

Fit and compare complex models reliably and rapidly. Advanced nested sampling.
https://johannesbuchner.github.io/UltraNest/

Issue when running on SLURM cluster #76

Open rfranceschi opened 1 year ago

rfranceschi commented 1 year ago

Description

Running UltraNest on a remote SLURM cluster on multiple nodes using Intel MPI 2019.9.

What I Did

Previously, my code ran without issues. The fitting part of my code has not changed, but it now fails with the following message:

  File ".../fitters.py", line 437, in fit
    for i, result in enumerate(self.sampler.run_iter(**self.run_kwargs)):
  File ".../lib/python3.8/site-packages/ultranest/integrator.py", line 2579, in run_iter
    dlogz_min_num_live_points, (Llo_KL, Lhi_KL), (Llo_ess, Lhi_ess) = self._find_strategy(
  File ".../lib/python3.8/site-packages/ultranest/integrator.py", line 1545, in _find_strategy
    widthratio = 1 - np.exp(logweights[1:,0] - logweights[:-1,0])

This is how the sampler is initialized, nothing special happening here:

self.sampler = ultranest.ReactiveNestedSampler(
    [str(param) for param in self.parameters],
    lnprob,
    self.transform,
    log_dir=self.log_dir,
    resume=self.resume,
    storage_backend=self.storage_backend,
    **self.fitter_kwargs
)

This does not happen on my local machine, so I suspect it may be due to a change on the cluster. Could this be the case, or could this be an issue within UltraNest or my code?

Thank you in advance!

JohannesBuchner commented 1 year ago

Please post the entire error message and the debug.log.

Please also post all of your arguments: storage_backend and fitter_kwargs.

Perhaps there is a rounding issue, if the sampled points receive strange weights.

You could try putting a print just before the failing line to see what logweights and widthratio are.
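A minimal sketch of what such a debug print could look like. The `describe` helper and the stand-in values are hypothetical and not part of UltraNest; in practice the two `print` calls would go around the failing line in `_find_strategy`, using the real `logweights`.

```python
import numpy as np


def describe(name, value):
    """Return a one-line summary of an array-like (hypothetical helper)."""
    arr = np.asarray(value)
    return (f"{name}: type={type(value).__name__} "
            f"shape={arr.shape} ndim={arr.ndim} dtype={arr.dtype}")


# Stand-in data: a 2-D array, which is what the failing line expects.
logweights = np.log(np.array([[0.5, 0.4], [0.3, 0.35], [0.2, 0.25]]))
print(describe("logweights", logweights))

# Same expression as in the traceback, evaluated on the stand-in data.
widthratio = 1 - np.exp(logweights[1:, 0] - logweights[:-1, 0])
print(describe("widthratio", widthratio))
```

Printing the shape and number of dimensions, rather than only the values, makes it easier to spot an array with an unexpected layout.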

rfranceschi commented 1 year ago

Apologies, I forgot the last line of the error message:

  File ".../fitters.py", line 437, in fit
    for i, result in enumerate(self.sampler.run_iter(**self.run_kwargs)):
  File ".../lib/python3.8/site-packages/ultranest/integrator.py", line 2579, in run_iter
    dlogz_min_num_live_points, (Llo_KL, Lhi_KL), (Llo_ess, Lhi_ess) = self._find_strategy(
  File ".../lib/python3.8/site-packages/ultranest/integrator.py", line 1545, in _find_strategy
    widthratio = 1 - np.exp(logweights[1:,0] - logweights[:-1,0])
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

Here are the parameters:

storage_backend: Literal['hdf5', 'csv', 'tsv'] = 'hdf5'
run_kwargs={'dlogz': 1, 'dKL': 1, 'frac_remain': 0.5, 'Lepsilon': 0.01, 'min_num_live_points': 100}

I am attaching two files with the values of logweights and widthratio, and the debug logger.

Attachments: debug.log, logweights.txt, widthratio.txt

JohannesBuchner commented 1 year ago

That's odd: logweights is a 1-dimensional array instead of a 2-dimensional one.
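For illustration, a minimal sketch of why a 1-dimensional array triggers this IndexError. The values below are made-up stand-ins, not the contents of the attached logweights.txt; only the indexing expression is taken from the traceback.

```python
import numpy as np

# Stand-in: a 1-D array where the failing line expects a 2-D one.
logweights = np.array([-1.0, -2.0, -3.0])

message = None
try:
    # Two indices ([1:, 0]) are applied to an array with a single axis.
    widthratio = 1 - np.exp(logweights[1:, 0] - logweights[:-1, 0])
except IndexError as exc:
    message = str(exc)
    print(message)
```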

JohannesBuchner commented 1 year ago

Could you check if changing the line https://github.com/JohannesBuchner/UltraNest/blob/v3.3.3/ultranest/integrator.py#L1543

logweights = np.array(main_iterator.logweights[:itmax])

to

logweights = np.array(main_iterator.logweights[:itmax,:])

fixes this issue?
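A small sketch of what this change does at the NumPy level, assuming main_iterator.logweights is a NumPy array (an assumption; its actual type is not shown in this thread). The arrays `good` and `bad` are hypothetical stand-ins. On a 2-D array both slicings give the same result, while on a 1-D array the explicit second index fails at the slicing line itself rather than later at the widthratio line.

```python
import numpy as np

# Hypothetical stand-ins for main_iterator.logweights.
good = np.zeros((5, 3))  # 2-D, as the later code expects
bad = np.zeros(5)        # 1-D, as reported in this issue
itmax = 4

# On a 2-D array the two slicings are equivalent.
assert np.array_equal(np.array(good[:itmax]), np.array(good[:itmax, :]))

# On a 1-D array the original slicing succeeds and the failure is deferred.
deferred = np.array(bad[:itmax])
print(deferred.shape)

# With the explicit second index, the same input fails immediately.
failed_early = False
try:
    np.array(bad[:itmax, :])
except IndexError as exc:
    failed_early = True
    print("fails at the slicing line:", exc)
```

If the modified line raises, the array was already 1-dimensional before slicing, which narrows down where the unexpected shape comes from.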