JohannesBuchner / PyMultiNest

Pythonic Bayesian inference and visualization for the MultiNest Nested Sampling Algorithm and PyCuba's cubature algorithms.
http://johannesbuchner.github.io/PyMultiNest/

Defining Prior for a sharply peaked likelihood #190

Closed GhoshTathagata closed 2 years ago

GhoshTathagata commented 3 years ago

I am currently facing an issue when defining a prior for a sharply peaked likelihood. The likelihood can be considered a Gaussian with a standard deviation on the order of 0.0005. If I define a uniform prior over a suitable range, the likelihood evaluates to zero most of the time, because a prior sample would need to be precise to many decimal places to land on the peak. Is there any way to incorporate this precision when defining the prior? I illustrate the problem with 1 parameter; however, I am actually handling 4 parameters, of which the marginalized distributions of 2 are highly peaked. Any suggestion to speed up the calculation would be helpful. Thanks.
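A minimal sketch of what I mean (all numbers here are illustrative, not my actual problem): with a peak this narrow, the raw likelihood underflows to exactly zero over most of a wide uniform prior, while the log-likelihood stays finite everywhere.

```python
import numpy as np

# Hypothetical 1-parameter setup: narrow Gaussian likelihood inside a wide uniform prior.
MU, SIGMA = 0.5, 0.0005          # peak location and width (illustrative values)
LO, HI = 0.0, 1.0                # uniform prior range

def prior_transform(u):
    """Map a unit-cube sample u in [0, 1) to the uniform prior [LO, HI]."""
    return LO + (HI - LO) * u

def likelihood(x):
    """Un-logged Gaussian likelihood; underflows to exactly 0 far from the peak."""
    return np.exp(-0.5 * ((x - MU) / SIGMA) ** 2)

def log_likelihood(x):
    """Log-likelihood stays finite and informative even far from the peak."""
    return -0.5 * ((x - MU) / SIGMA) ** 2

rng = np.random.default_rng(0)
xs = prior_transform(rng.random(10000))
frac_zero = np.mean(likelihood(xs) == 0.0)   # most prior draws underflow to 0
```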

JohannesBuchner commented 3 years ago

If your prior is extremely large compared to the posterior, perhaps it is not defined very well. If you simulate a few data sets drawn randomly from the prior, do they look like reasonable a-priori examples?

If you want wide support while concentrating the prior towards some location, you could use a Gaussian or Student-t prior.
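Such priors can be implemented in the unit-cube transform by passing the cube coordinate through an inverse CDF (scipy's `ppf`). A sketch, where the location and scale are made-up placeholders you would replace with your a-priori guess:

```python
from scipy.stats import norm, t

# Assumed (illustrative) prior location and width -- not from the issue.
GUESS, SCALE = 0.5, 0.05

def gaussian_prior(u):
    """Map unit-cube coordinate u to a Gaussian prior via the inverse CDF."""
    return norm.ppf(u, loc=GUESS, scale=SCALE)

def student_t_prior(u, df=3):
    """Student-t prior: heavier tails, more forgiving if GUESS is off."""
    return t.ppf(u, df, loc=GUESS, scale=SCALE)
```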

I am not sure I understood -- are you saying the problem is floating point numerical precision, such that the likelihood peaks at 1e10 +- 0.0005 for example?

GhoshTathagata commented 3 years ago

A Gaussian or Student-t distribution is somewhat informative. I am dealing with a 4-dimensional likelihood; I attach a corner plot of the 4 parameters for your reference. The likelihood comes out to zero most of the time, which makes the code slow.

[attached image: parameter_space corner plot]

JohannesBuchner commented 3 years ago

I am not sure I understand what your question is exactly.

JohannesBuchner commented 3 years ago

Your function is computing a log-likelihood, right?

GhoshTathagata commented 3 years ago

My problem is a little different, but currently I am exploring how to speed up the calculation, so please consider the following. The likelihood function remains the same as discussed above.

Suppose I use a uniform prior over x1, x2, x3, and x4 and construct the posterior P(x1, x2, x3, x4) by evaluating the likelihood L(x1, x2, x3, x4) (the attached corner plot), in order to check the computation time. P(x1, x2, x3, x4) and L(x1, x2, x3, x4) should then essentially be the same. Is there any suggestion to speed up the calculation (such as defining the prior in a particular way, or anything else), apart from parallel computation?

Yes, I first calculate L(x1, x2, x3, x4) using a 4-d Gaussian KDE (built from the 4-d joint distribution) and then take the logarithm.
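One concrete gain here: evaluating the KDE first and then taking `np.log` returns `-inf` wherever the density underflows to zero, whereas scipy's `gaussian_kde.logpdf` works in log space and stays finite. A 1-d sketch with made-up samples (the 4-d case works the same way):

```python
import numpy as np
from scipy.stats import gaussian_kde

# Illustrative sharply peaked samples standing in for the joint distribution.
rng = np.random.default_rng(1)
samples = rng.normal(0.0, 0.0005, size=2000)
kde = gaussian_kde(samples)

near = np.array([0.0002])   # inside the peak: both evaluations agree
far = np.array([1.0])       # far from the peak: kde(far) underflows to 0.0

log_near = kde.logpdf(near)           # evaluate directly in log space
with np.errstate(divide="ignore"):
    naive_far = np.log(kde(far))      # log(0) -> -inf, breaks the sampler
safe_far = kde.logpdf(far)            # very negative but still finite
```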

JohannesBuchner commented 3 years ago

It looks like if you reparametrize x3 to x3' = x3 + x4 or similar, you could approximate what you need with a correlated 2-d Gaussian for (x1, x2), a Gaussian on x3', and a uniform distribution on x4.
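As a sketch, that reparametrization can be written as a single unit-cube prior transform. All locations, scales, and the correlation below are made-up placeholders you would fit to the corner plot:

```python
import numpy as np
from scipy.stats import norm

MEAN12 = np.array([0.0, 0.0])            # assumed (x1, x2) location
L12 = np.linalg.cholesky([[1.0, 0.8],    # assumed covariance (Cholesky factor)
                          [0.8, 1.0]])
MU3P, SIG3P = 1.0, 0.1                   # assumed Gaussian for x3' = x3 + x4
X4_LO, X4_HI = 0.0, 1.0                  # assumed uniform range for x4

def prior_transform(u):
    """Map unit-cube u = (u1, u2, u3, u4) to (x1, x2, x3, x4)."""
    z = norm.ppf(u[:2])                  # standard normals for (x1, x2)
    x1, x2 = MEAN12 + L12 @ z            # correlated 2-d Gaussian
    x3p = norm.ppf(u[2], loc=MU3P, scale=SIG3P)
    x4 = X4_LO + (X4_HI - X4_LO) * u[3]
    x3 = x3p - x4                        # undo the reparametrization
    return np.array([x1, x2, x3, x4])
```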

If you are trying to reuse a posterior from another run, it may be easier to fit jointly (adding the first problem's term to the log-likelihood).

jvictor42 commented 2 years ago

> The gaussian or student t-distribution is somehow informative. I am dealing with 4-dimensional likelihood. I attach a corner plot of 4 parameters for your reference. The likelihood most of the time comes to zero, which makes the code slow.
>
> [attached image: parameter_space corner plot]

Interesting. Do you have this MCMC code and plot on github?