johannesulf / nautilus

Neural Network-Boosted Importance Nested Sampling for Bayesian Statistics
https://nautilus-sampler.readthedocs.io
MIT License
73 stars 8 forks source link

Maximum number of iterations in sampler.run() #23

Closed Jammy2211 closed 9 months ago

Jammy2211 commented 1 year ago

Great software, seeing a nice performance boost from Nautilus compared to Dynesty with my first set of profiling runs :)

Would it be possible to have an input which exits sampler.run(), for example if sampler.run(maxiter=10000) nautilus will exit after the 10000th sample?

The reason is that in our software (https://github.com/rhayes777/PyAutoFit) we exit the sampler every N iterations to output the latest results to hard-dis, including visualizing the maximum likelihood model. I don't see a way to do this with Nautilus currently.

johannesulf commented 1 year ago

Yes, that's a great suggestion. Being able to run the sampler for a bit and analyze intermediate results would be a good feature that I want to implement eventually. The whole fitting process is currently set up to run automatically to the end. While intermediate results are saved to the disk, these are not made available to the user.

Given how the algorithm works internally, stopping the fitting after exactly N samples is a bit tricky right now. I should be able to implement it eventually, though. The only thing that I would probably take into account is the batch size (100 by default) such that the sampler would stop after the smallest multiple of the batch size that's equal or larger than N. @Jammy2211, you suggest the interface sampler.run(maxiter=10000). In this example, would the sampler run until 10,000 iterations in total, i.e., not run anymore if it already ran for 10,000 iterations? Or would it run an additional 10,000 iterations, regardless of how many it has computed already?

Jammy2211 commented 1 year ago

you suggest the interface sampler.run(maxiter=10000). In this example, would the sampler run until 10,000 iterations in total, i.e., not run anymore if it already ran for 10,000 iterations? Or would it run an additional 10,000 iterations, regardless of how many it has computed already?

I have seen different sampling software implement both.

I prefer it never exceeding the number of iterations, so if you ran it again it would not run anymore (but it would continue running if you increased maxiter to 20000). But I think its really personal preference and depends on exactly what task you are doing.

I could use either implementation fine.

johannesulf commented 9 months ago

Thanks! I will have this implemented via your preferred method in the coming days.