johannesulf / nautilus

Neural Network-Boosted Importance Nested Sampling for Bayesian Statistics
https://nautilus-sampler.readthedocs.io
MIT License
73 stars 8 forks source link

splitting ellipsoids sometimes stops prematurely #28

Closed johannesulf closed 1 year ago

johannesulf commented 1 year ago

Nautilus relies on multi-ellipsoidal decomposition to propose new points. This is done by first putting an ellipsoid around the live points and then splitting ellipsoids further if the volume of the ellipsoid union is too large. To perform the split, nautilus takes the largest ellipsoid, assigns its members to two groups based on Gaussian mixture modeling (GMM), i.e. sklearn.mixture.GaussianMixture, puts ellipsoids around the two groups and removes the original ellipsoid. Currently, the splitting of ellipsoids may stop prematurely. That's because, in the current implementation, nautilus rejects a split if one of the groups has less than n_points_min members. In rare instances, i.e., the scenario described in #26, this may result in no ellipsoid divisions, and one ends up with a single very large ellipsoid that's many orders of magnitude larger than the live set volume.

To prevent this, nautilus must amend the GMM by putting points into the smaller group based on the per-Gaussian probability coming out of the GMM. Something similar was previously done when nautilus relied on K-Means instead of GMM.