JohannesBuchner / UltraNest

Fit and compare complex models reliably and rapidly. Advanced nested sampling.
https://johannesbuchner.github.io/UltraNest/

Recommendations on running UltraNest to solve a multi-modal problem #142

Open maudegull opened 4 months ago

maudegull commented 4 months ago

Hi Johannes,

Not so much a bug as a request for a bit of advice; I hope that's okay. This is my first time using nested sampling as opposed to MCMC and I have one or two questions. I am dealing with a problem that has a multi-modal distribution and is 8- or 9-dimensional (depending on the model used to fit the data).

My most basic setup follows this example https://johannesbuchner.github.io/UltraNest/example-sine-highd.html with a couple of changes. I'd really appreciate any comments on these as well.

  1. I increased the number of live points to have a higher resolution (to avoid missing peaks). I tried min_num_live_points=1000, which for my test case caught the highest mode (and I am trying min_num_live_points=5000 too, though it's probably excessive).
  2. I also set frac_remain=1e-4 (though I think 1e-5 could potentially improve peak detection).
  3. cluster_num_live_points=100; to my understanding that should help with having enough live points per detected mode.
  4. min_ess=1000, to have enough valid samples in my peaks.
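
For reference, the four settings above map onto keyword arguments of `ReactiveNestedSampler.run`, roughly as in the sketch below. The helper name `run_with_step_sampler` and its arguments are hypothetical, and the `SliceSampler` setup mirrors the linked sine example rather than the poster's actual script. The `ultranest` import is deferred into the helper so the settings can be inspected without the package installed.

```python
# Run settings quoted in the list above.
RUN_KWARGS = dict(
    min_num_live_points=1000,     # minimum number of live points during the run
    frac_remain=1e-4,             # stop once this fraction of the integral remains in the live points
    cluster_num_live_points=100,  # live points requested per detected cluster
    min_ess=1000,                 # target number of effective posterior samples
)


def run_with_step_sampler(param_names, loglike, transform, nsteps):
    """Hypothetical helper: run UltraNest with a slice step sampler.

    param_names, loglike and transform are the user's own model definition;
    nsteps is the number of slice-sampling steps per new live point.
    """
    import ultranest
    import ultranest.stepsampler

    sampler = ultranest.ReactiveNestedSampler(param_names, loglike, transform=transform)
    sampler.stepsampler = ultranest.stepsampler.SliceSampler(
        nsteps=nsteps,
        generate_direction=ultranest.stepsampler.generate_mixture_random_direction,
    )
    return sampler.run(**RUN_KWARGS)
```

Calling `run_with_step_sampler(names, loglike, transform, nsteps=2 * len(names))` would correspond to the 2xndim starting point discussed next.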

The trickiest part, to my understanding, is the number of steps. With the setup I have, if this were a mono-modal distribution, 2xndim steps would be enough (in theory; I am roughly basing this assumption on Fig. 2 of https://arxiv.org/pdf/2211.09426 ). I was wondering whether that still holds true (in general) for nested sampling.

I have been running the calibrator and it's currently exploring 4xndim; however, a common message I am getting after each iteration is the following:

" step sampler diagnostic: jump distance 0.10 (should be >1), far enough fraction: 14.69% : very fishy. Double nsteps and see if fraction and lnZ change) "

For model A it is still exploring and logZ is getting better (though the changes are of the order ~0.2), but I am curious to know whether the jump distance and fraction can actually improve significantly for a multi-modal distribution. Can the sampler basically get "stuck" in one of the sub-modes, yield a really low jump distance, and therefore never reach the >1 threshold? And is the fraction going to encounter the same issue as in MCMC, where it is low simply due to the nature of the problem (i.e. a lot of rejections when hitting a valley)? If that is the case, is there another diagnostic to use?

For the other model, logZ is actually getting worse every time I increase the number of steps by a factor of 2. Is that something that can happen for multi-modal distributions, or is it suggestive of a bug? I.e., was I just really lucky in my first run with the lowest nsteps?

Lastly, so far even with the nsteps ~ 2xndim test runs, the runs are able to converge and I was able to recover really great fits to my data, so I do think that overall ultranest will be able to work for my problem. I am waiting to see how the calibrator route plays out; however, I am not sure that I can rely on the same diagnostics as for a mono-modal problem. Are there any other tips, tricks, or checks for ultranest to improve the handling of a multi-modal distribution?

I'd really appreciate any further feedback!

JohannesBuchner commented 4 months ago

Hi,

> Not so much a bug as a request for a bit of advice; I hope that's okay.

Sure!

> This is my first time using nested sampling as opposed to MCMC and I have one or two questions. I am dealing with a problem that has a multi-modal distribution and is 8- or 9-dimensional (depending on the model used to fit the data).

> My most basic setup follows this example https://johannesbuchner.github.io/UltraNest/example-sine-highd.html with a couple of changes. I'd really appreciate any comments on these as well.

> 1. I increased the number of live points to have a higher resolution (to avoid missing peaks). I tried min_num_live_points=1000, which for my test case caught the highest mode (and I am trying min_num_live_points=5000 too, though it's probably excessive).

Did you have an issue not finding the mode with a lower number of live points?

> 2. I also set frac_remain=1e-4 (though I think 1e-5 could potentially improve peak detection).

This only influences when the run is terminated, so it matters only if there is a peak hidden near the apparent maximum likelihood, as in a spike-and-slab problem.

> 3. cluster_num_live_points=100; to my understanding that should help with having enough live points per detected mode.

That should be fine. Do you see the sampler discovering a number of distinct modes and then increasing the number of live points during a run?

> 4. min_ess=1000, to have enough valid samples in my peaks.

OK, but possibly not necessary if you already increased the number of live points to 1000 or higher.

> The trickiest part, to my understanding, is the number of steps. With the setup I have, if this were a mono-modal distribution, 2xndim steps would be enough (in theory; I am roughly basing this assumption on Fig. 2 of https://arxiv.org/pdf/2211.09426 ). I was wondering whether that still holds true (in general) for nested sampling.

> I have been running the calibrator and it's currently exploring 4xndim; however, a common message I am getting after each iteration is the following:

> " step sampler diagnostic: jump distance 0.10 (should be >1), far enough fraction: 14.69% : very fishy. Double nsteps and see if fraction and lnZ change) "

> For model A it is still exploring and logZ is getting better (though the changes are of the order ~0.2), but I am curious to know whether the jump distance and fraction can actually improve significantly for a multi-modal distribution. Can the sampler basically get "stuck" in one of the sub-modes, yield a really low jump distance, and therefore never reach the >1 threshold? And is the fraction going to encounter the same issue as in MCMC, where it is low simply due to the nature of the problem (i.e. a lot of rejections when hitting a valley)? If that is the case, is there another diagnostic to use?

> For the other model, logZ is actually getting worse every time I increase the number of steps by a factor of 2. Is that something that can happen for multi-modal distributions, or is it suggestive of a bug? I.e., was I just really lucky in my first run with the lowest nsteps?

> Lastly, so far even with the nsteps ~ 2xndim test runs, the runs are able to converge and I was able to recover really great fits to my data, so I do think that overall ultranest will be able to work for my problem. I am waiting to see how the calibrator route plays out; however, I am not sure that I can rely on the same diagnostics as for a mono-modal problem. Are there any other tips, tricks, or checks for ultranest to improve the handling of a multi-modal distribution?

> I'd really appreciate any further feedback!

Not sure what "logZ is getting worse" means, but if it keeps decreasing or increasing as nsteps grows, then yes, this is a common sign that the number of steps is still not sufficient (and the calibrator checks for that too).
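
The check described here (double nsteps and see whether lnZ still changes) can be sketched as a small loop. This is an illustration only and not the calibrator's actual stopping rule: `run_fn` is a hypothetical callable that performs one full nested sampling run with the given nsteps and returns its logZ, `tol` is a user-chosen tolerance, and `toy_run` is a made-up stand-in whose bias shrinks as nsteps grows.

```python
def double_nsteps_until_stable(run_fn, nsteps_start, tol, max_runs=6):
    """Double nsteps until two consecutive runs agree in logZ within tol.

    Returns (history, converged), where history is a list of (nsteps, logz).
    """
    history = []
    nsteps = nsteps_start
    for _ in range(max_runs):
        logz = run_fn(nsteps)
        # Compare against the previous run, if there is one.
        converged = bool(history) and abs(logz - history[-1][1]) <= tol
        history.append((nsteps, logz))
        if converged:
            return history, True
        nsteps *= 2
    return history, False


def toy_run(nsteps):
    """Made-up stand-in for a real run: logZ approaches -10 as nsteps grows."""
    return -10.0 - 4.0 / nsteps


history, converged = double_nsteps_until_stable(toy_run, nsteps_start=16, tol=0.1)
for nsteps, logz in history:
    print(nsteps, logz)
print("converged:", converged)
```

In a real application `run_fn` would wrap a complete sampler run, so each iteration costs roughly twice as much as the previous one.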

There are some comments in the discussion of https://arxiv.org/abs/2402.11936 on whether RJD (the relative jump distance) always works, and some test cases with multimodal distributions.

maudegull commented 4 months ago

Hi Johannes,

Thanks for the swift response!

  1. With fewer live points I was only able to find the highest mode, though that was also with very few nsteps. Based on my comparison between 1000 and 5000, nsteps seems to have more influence on whether it explores all modes. I'm guessing this implies I could possibly do it with 400 live points. What I know about my multi-modal distribution is that one parameter in particular drives it, and the peaks and valleys of the distribution of this parameter are very close to one another, which is why I originally thought that increasing the live points was necessary. But the separation of the modes is of the order ~0.01 in the parameter that drives the multimodality (in "cube" space), so that should be captured with the resolution of 400 live points (I'm assuming it has an initial resolution of 1/400?).

  2. What I see happening is that during the run it finds multiple modes (I can see them in the live points plot); however, towards the end it has switched back to mono-modal. When I plot the results in the corner plot, though, I do see the different modes. I am assuming that it ends up exploring only the best mode? The printed status shows N=1000 all the time, so it may not be increasing the number of live points overall.

  3. You're right that I clear the min_ess pretty easily.
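
The resolution estimate in point 1 can be checked with a short calculation. The sketch assumes that the initial live points are drawn uniformly from the unit cube and treats a mode as a slab of width `width` along the single parameter that drives the multimodality, which is a simplification; the function names are illustrative, and the quoted separation of 0.01 is used as a stand-in for the mode width.

```python
def expected_points_in_slab(n_live, width):
    """Expected number of uniform initial points with one coordinate inside the slab."""
    return n_live * width


def prob_at_least_one(n_live, width):
    """Probability that at least one of n_live uniform points falls inside the slab."""
    return 1.0 - (1.0 - width) ** n_live


for n_live in (400, 1000, 5000):
    print(
        n_live,
        expected_points_in_slab(n_live, 0.01),
        round(prob_at_least_one(n_live, 0.01), 3),
    )
```

Under these assumptions, 400 live points put about 4 initial points in such a slab, with a probability of roughly 0.98 that it is hit at least once. This says nothing about whether the step sampler later keeps the mode populated, which is where nsteps comes in.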

Thanks for the reference; it seems I can definitely improve the RJD at least a little bit, though potentially not enough to clear 1.

Thanks a lot for the comments, that was helpful!

JohannesBuchner commented 4 months ago
  1. OK.
  2. Maybe above some likelihood threshold there is only one mode left.

At least this paper argues that the product of the number of live points and nsteps is important: https://arxiv.org/abs/2308.05816
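
If the product of live points and nsteps is taken as the quantity to hold fixed, the configurations mentioned in this thread can be compared with a few lines of arithmetic. The helper `nsteps_for_budget` is hypothetical, and the reference budget (1000 live points with 2xndim steps for ndim=8) is just one of the setups discussed above. Whether the product is the right quantity to hold fixed is the cited paper's argument, not something this sketch demonstrates.

```python
import math


def nsteps_for_budget(budget, n_live):
    """Smallest nsteps such that n_live * nsteps >= budget."""
    return math.ceil(budget / n_live)


ndim = 8
budget = 1000 * 2 * ndim  # 1000 live points times 2xndim steps

for n_live in (400, 1000, 5000):
    print(n_live, nsteps_for_budget(budget, n_live))
```

With this budget, dropping to 400 live points would call for 40 steps, while 5000 live points would need only 4.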