joshspeagle / dynesty

Dynamic Nested Sampling package for computing Bayesian posteriors and evidences
https://dynesty.readthedocs.io/
MIT License

use gaussian mixture as prior #300

Closed doublestrong closed 3 years ago

doublestrong commented 3 years ago

Hi, I want to use a Gaussian mixture as a prior, so I defined prior_transform to push a sample u from the unit hypercube onto it. I use the first dimension of u to select a component of the mixture and then map u to that Gaussian component. It works, and I will show my results below.

My question is whether my approach is correct in principle. Basically, I cut the hypercube into several chunks, and each chunk is mapped to one of the Gaussian components. Will this cause trouble when the likelihood contour shrinks in the hypercube while exploring new live points?

The dimension of my model is 6. I place a bimodal bi-variate Gaussian mixture on (x1, x2):

# Gaussian mixture prior for (x1, x2); the covariance is shared between components.
import numpy as np

weights = [0.5, 0.5]
mu1 = np.array([0.0, 0.0])
mu2 = np.array([5.0, 0.0])
cov = np.array([[1.0, 0.0],
                [0.0, 0.00001]])
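
Below is a minimal sketch of the partition-based prior transform described above, for the (x1, x2) block only, assuming the diagonal covariance from the snippet so each component can be mapped with independent normal inverse CDFs; the function name and the rescaling of u[0] are illustrative, not part of dynesty:

import numpy as np
from scipy.stats import norm

weights = [0.5, 0.5]
mu1 = np.array([0.0, 0.0])
mu2 = np.array([5.0, 0.0])
cov = np.array([[1.0, 0.0],
                [0.0, 0.00001]])
sigma = np.sqrt(np.diag(cov))  # per-dimension standard deviations (diagonal covariance)

def prior_transform_x12(u):
    # u[0] both selects the mixture component and, after rescaling, samples within it:
    # [0, w1) -> component 1, [w1, 1) -> component 2.
    if u[0] < weights[0]:
        mu, u0 = mu1, u[0] / weights[0]
    else:
        mu, u0 = mu2, (u[0] - weights[0]) / weights[1]
    # Map the (rescaled) uniform samples through that component's normal inverse CDF.
    x1 = mu[0] + sigma[0] * norm.ppf(u0)
    x2 = mu[1] + sigma[1] * norm.ppf(u[1])
    return np.array([x1, x2])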

The other three factors in my model are implemented via the conditional prior and the log-likelihood (a sketch follows the list):

Conditional prior:

  1. translation from (x1, x2) to (x3, x4): (x3, x4) = (x1, x2) + N(mu = [20, 0], sigma = cov)
  2. translation from (x3, x4) to (x5, x6): (x5, x6) = (x3, x4) + N(mu = [0, 20], sigma = cov)

Log-likelihood:

  3. observed translation from (x1, x2) to (x5, x6): (x5, x6) = (x1, x2) + N(mu = [20, 20], sigma = cov)
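
A minimal sketch of these three factors, with the shared covariance repeated for self-containment and independent normal inverse CDFs for the conditional prior; the function names and the index layout theta = [x1, x2, x3, x4, x5, x6] are illustrative:

import numpy as np
from scipy.stats import norm, multivariate_normal

cov = np.array([[1.0, 0.0],
                [0.0, 0.00001]])
sigma = np.sqrt(np.diag(cov))  # per-dimension std from the shared diagonal covariance

def conditional_prior(x12, u):
    # 1. (x3, x4) = (x1, x2) + N(mu=[20, 0], cov), drawn from u[2], u[3]
    x34 = x12 + np.array([20.0, 0.0]) + sigma * norm.ppf(u[2:4])
    # 2. (x5, x6) = (x3, x4) + N(mu=[0, 20], cov), drawn from u[4], u[5]
    x56 = x34 + np.array([0.0, 20.0]) + sigma * norm.ppf(u[4:6])
    return x34, x56

def loglike(theta):
    # 3. observed translation from (x1, x2) to (x5, x6) with mean [20, 20] and covariance cov
    x12, x56 = theta[0:2], theta[4:6]
    return multivariate_normal(mean=[20.0, 20.0], cov=cov).logpdf(x56 - x12)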

[Attached result plots: myplot1, myplot2, myplot3]

joshspeagle commented 3 years ago

This should be okay in principle, although it might lead to some difficulty sampling right at the boundary where you transition between mixture components. I think an alternative parameterization that would be a bit more stable is to compute the CDF of the mixture directly, since that is fundamentally continuous rather than a set of discrete partitions. Given that the problem appears to be pretty low-dimensional, though, I don't think it matters too much.
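
To illustrate the CDF-based parameterization for the first coordinate, here is a minimal sketch assuming scipy is available; the mixture CDF has no closed-form inverse, so it is inverted numerically with a root finder (brentq), and the bracketing interval and parameter values are illustrative:

import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

# 1D mixture along x1 (assumed weights/means/sigmas for illustration).
weights = np.array([0.5, 0.5])
means = np.array([0.0, 5.0])
sigmas = np.array([1.0, 1.0])

def mixture_cdf(x):
    # CDF of the Gaussian mixture: weighted sum of the component normal CDFs.
    return np.sum(weights * norm.cdf(x, loc=means, scale=sigmas))

def mixture_ppf(u, lo=-20.0, hi=25.0):
    # Numerically invert the mixture CDF on a bracketing interval.
    return brentq(lambda x: mixture_cdf(x) - u, lo, hi)

# Inside prior_transform: x1 = mixture_ppf(u[0]); this mapping is continuous in u[0],
# unlike the discrete partition of the hypercube.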

segasai commented 3 years ago

My suggestion would instead be to define a single Gaussian prior G1(theta), whose mean/covariance are determined from the means/covariances of your two components, and then use the likelihood function L(theta) * G2(theta) / G1(theta), where G2(theta) is your two-Gaussian prior.
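
A minimal sketch of this reweighting, assuming scipy and the mixture parameters from the original post; G1 is a single broad Gaussian chosen here to cover both components (an illustrative choice), and loglike_orig is a placeholder for the original log-likelihood:

import numpy as np
from scipy.stats import multivariate_normal

# Two-component mixture prior G2 (parameters from the original post).
weights = np.array([0.5, 0.5])
mus = [np.array([0.0, 0.0]), np.array([5.0, 0.0])]
cov = np.array([[1.0, 0.0],
                [0.0, 0.00001]])

# Single Gaussian prior G1, broad enough to cover both components.
g1 = multivariate_normal(mean=[2.5, 0.0], cov=[[10.0, 0.0], [0.0, 0.001]])

def log_g2(x12):
    # Log-density of the two-component mixture via logaddexp for numerical stability.
    logps = [np.log(w) + multivariate_normal(mean=mu, cov=cov).logpdf(x12)
             for w, mu in zip(weights, mus)]
    return np.logaddexp(logps[0], logps[1])

def loglike_orig(theta):
    # Placeholder for the original log-likelihood of the 6-dimensional model.
    return 0.0

def loglike_reweighted(theta):
    # Original log-likelihood plus log(G2/G1) evaluated at (x1, x2).
    x12 = theta[:2]
    return loglike_orig(theta) + log_g2(x12) - g1.logpdf(x12)

The prior transform for (x1, x2) then only needs the inverse CDF of the single Gaussian G1.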

doublestrong commented 3 years ago

This should be okay in principle, although it might lead to some difficulty sampling right at the boundary where you transition between mixture components. I think an alternative parameterization that would be a bit more stable is to compute the CDF of the mixture directly, since that is fundamentally continuous rather than a set of discrete partitions. Given that the problem appears to be pretty low-dimensional, though, I don't think it matters too much.

Thanks for the great answer! Just to make sure I understand correctly: by computing the CDF (p = F(x)), do you mean finding the inverse of F? I think a GMM does not have a closed-form inverse CDF, so the inverse must be computed numerically, right? Do you know any fast numerical way to do so? Many thanks!

doublestrong commented 3 years ago

My suggestion would instead be to define a single Gaussian prior G1(theta), whose mean/covariance are determined from the means/covariances of your two components, and then use the likelihood function L(theta) * G2(theta) / G1(theta), where G2(theta) is your two-Gaussian prior.

Thanks for the great answer! This sounds good in terms of keeping the prior transform simple. I may get back here after some tests.

doublestrong commented 3 years ago

Problem resolved! Thanks, guys! I may get back here after some tests.