johannesulf / nautilus

Neural Network-Boosted Importance Nested Sampling for Bayesian Statistics
https://nautilus-sampler.readthedocs.io
MIT License

Adding Bounds being skipped for certain fits #37

Closed · Jammy2211 closed this issue 1 year ago

Jammy2211 commented 1 year ago

For the vast majority of Nautilus use-cases, the code runs brilliantly.

However, for a small subset, I am running into an issue where the adding of bounds is being skipped:

Adding Bound 49:  skipped
Filling Bound 48:    done
N_like:            192432
N_eff:                  1
log Z:          40185.058
log V:            -27.277
f_live:             1.000

Adding Bound 49:  skipped
Filling Bound 48:    done
N_like:            193192
N_eff:                  1
log Z:          40185.058
log V:            -27.277
f_live:             1.000

This leads to extremely long Nautilus run times, and once the run is complete it appears to cause some sort of infinite loop when the sampler.posterior() function is called.

This occurs for a small fraction of input datasets, indicating it is probably something perverse about their likelihood functions.

I have uploaded a checkpoint file here: https://drive.google.com/file/d/16LU44iwQ_dckJjfn_5_NycWk6MgPV_GE/view?usp=sharing

Thank you!

johannesulf commented 1 year ago

@Jammy2211 Thanks for raising this issue. Some background on what's happening: when nautilus proposes a bound, it can happen that the newly proposed bound is larger in volume than the previous one. In this case, nautilus simply rejects the new bound and continues adding points to the old one. After some time, it'll try to build a new one again. Generally, bounds being skipped indicates that nautilus is having a hard time figuring out the high-likelihood region. This can be improved by, for example, increasing the number of live points.
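
For example, a minimal sketch of raising n_live when constructing the sampler (prior, likelihood, and the value 2000 here are placeholders, not taken from this issue):

from nautilus import Sampler

# prior and likelihood are the user's callables, defined elsewhere.
# More live points give nautilus more samples to train each bound on,
# which can help it locate the high-likelihood region and skip fewer bounds.
sampler = Sampler(prior, likelihood, 5, n_live=2000)
sampler.run(verbose=True)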

So bounds being skipped is nothing inherently problematic. But calling sampler.posterior() shouldn't result in any crash or freeze. I currently don't know why this would happen, but I'll have a closer look and try the file you sent. Thanks so much for providing that.

Jammy2211 commented 1 year ago

Okay, I'll do some tests with a higher n_live.

The setup works fine for ~90% of datasets, so it must be having these issues only for specific datasets. Which probably makes sense.

Jammy2211 commented 1 year ago

I have increased n_live from 75 to 450 but still get the skipped behavior, so I anticipate it is something specific about the likelihood function in these cases.

johannesulf commented 1 year ago

Interesting! It seems like it's only a 5-dimensional problem. So I would have expected 450 live points to work quite well. Still, it can happen that some boundaries are skipped once in a while. Does it happen with the same frequency for 450 as it does for 75?

I also tried reproducing the freeze when calling sampler.posterior() but was unable to do so. Also, I currently can't imagine why such a freeze would occur at all. Can you produce a freeze working only with the checkpoint file you sent me?

from nautilus import Sampler

# prior and likelihood must match the ones used for the original run.
sampler = Sampler(prior, likelihood, 5, n_live=75, filepath='checkpoint.hdf5')
print(sampler.posterior())
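
For reference, posterior() returns three arrays; a minimal sketch of unpacking them, continuing from the snippet above (variable names are illustrative):

import numpy as np

# points: posterior samples; log_w: log importance weights; log_l: log-likelihoods.
points, log_w, log_l = sampler.posterior()

# Normalized weights, e.g. for a weighted corner plot.
weights = np.exp(log_w - np.max(log_w))
weights /= np.sum(weights)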
Jammy2211 commented 1 year ago

I have had the issue crop up in another use case, this time in an N=3 dimensional parameter space. So it could well be specific to lower-dimensional problems.

The main problem is that nautilus seems to never converge when this happens. For the specific fit in question, with most datasets (when this issue doesn't crop up) the fit completes after ~5000 iterations. For datasets where the skipped behavior occurs, nautilus can run for >100000 iterations, even though everything except the data I'm fitting is the same.

> Does it happen with the same frequency for 450 as it does for 75?

It looks like it, and in both cases nautilus runs for 100000+ iterations as if it's indefinitely "stuck".

> I also tried reproducing the freeze when calling sampler.posterior() but was unable to do so. Also, I currently can't imagine why such a freeze would occur at all. Can you produce a freeze working only with the checkpoint file you sent me?

The freeze may not actually be in the call to sampler.posterior(). Something was definitely getting stuck, but it was difficult to test on the HPC where in the code it occurred. I will try to reproduce it, but the crashing behavior could be associated with a slightly different part of the code.

johannesulf commented 1 year ago

Thanks for the input! I checked the checkpoint file and did notice that nautilus is having a hard time figuring out the likelihood. This also has to do with the fact that there are two distinct peaks that nautilus doesn't separate. The issue may be related to that. @Jammy2211 Can you send me the code for the 3-dimensional problem? Maybe I can find some ways to improve this behavior. nautilus shouldn't struggle with 3-dimensional likelihoods.
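
For anyone wanting to probe this kind of failure locally, here is a minimal sketch of a bimodal test likelihood (illustrative only; the dimensionality, peak locations, and widths are assumptions, not the user's actual model):

import numpy as np
from scipy.stats import multivariate_normal

def prior(u):
    # Map the 3-dimensional unit cube to [-10, 10]^3.
    return 20.0 * u - 10.0

def likelihood(x):
    # Two well-separated, unit-covariance Gaussian peaks.
    peak_a = multivariate_normal.logpdf(x, mean=[-5.0, -5.0, -5.0])
    peak_b = multivariate_normal.logpdf(x, mean=[5.0, 5.0, 5.0])
    return np.logaddexp(peak_a, peak_b)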

Jammy2211 commented 1 year ago

Apologies for the delay, hectic week.

Could you let me know if you can install the following library: https://pyautocti.readthedocs.io/en/latest/index.html

One dependency is hit or miss as to whether it installs, and if it's problematic I'll try to come up with a simpler method. If the install is okay, I can send a script.

johannesulf commented 1 year ago

Thanks! I installed the library with some workarounds. The script may or may not work for me out of the box, but if it doesn't, I'll most likely be able to figure it out.

johannesulf commented 1 year ago

@Jammy2211 Let me know if you have the script. I'd be happy to work on it.

Jammy2211 commented 1 year ago

When I tried to reproduce this issue, I no longer got the skipped behavior.

It occurs when the data I input into the analysis has defects that mess up the likelihood function. I improved the data processing, and the issue went away on this occasion.

For various different models we are still seeing this skipped behavior crop up now and then, often leading to nautilus running for days to complete a fit that normally takes <12 hours. I will open an issue here once we have an example that seems appropriate for testing.