SheffieldML / GPy

Gaussian processes framework in python
BSD 3-Clause "New" or "Revised" License

EP marginal likelihood calculations are off #296

Closed: slinderman closed this issue 8 years ago

slinderman commented 8 years ago

Hi, I've been playing with marginal likelihood estimation techniques for GP classification, and I've found some strange behavior in the EP calculations. I wrote my own annealed importance sampling (AIS) code, and my results line up very nicely with the marginal likelihood estimate from GPy's Laplace approximation, but the estimates using EP are way off. The figure below shows this for a simple 1D example where the true length scale is 1.0. I fit GPs with length scales of [0.1, 0.5, 1.0, 2.0, 5.0] using either Laplace or EP and plotted the resulting marginal likelihood estimates. The EP estimate is off by roughly 1000 nats, and it doesn't peak at the true length scale of 1.0.

Here's a gist that reproduces the behavior for Laplace and EP: https://gist.github.com/slinderman/413090470c00e44f688f
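
For readers who don't want to open the gist, the comparison is roughly the following. This is a minimal sketch, not the gist itself; it assumes GPy's Bernoulli likelihood with its default probit link and a unit-variance RBF kernel.

```python
import numpy as np
import GPy

# Toy 1D classification data drawn from a GP with the "true" length scale of 1.0.
np.random.seed(0)
X = np.random.uniform(-3., 3., (50, 1))
true_kern = GPy.kern.RBF(1, lengthscale=1.0)
f = np.random.multivariate_normal(np.zeros(50), true_kern.K(X))
Y = (f > 0).astype(float).reshape(-1, 1)

lik = GPy.likelihoods.Bernoulli()
for ls in [0.1, 0.5, 1.0, 2.0, 5.0]:
    kern = GPy.kern.RBF(1, lengthscale=ls)
    laplace = GPy.core.GP(X, Y, kernel=kern.copy(), likelihood=lik.copy(),
                          inference_method=GPy.inference.latent_function_inference.Laplace())
    ep = GPy.core.GP(X, Y, kernel=kern.copy(), likelihood=lik.copy(),
                     inference_method=GPy.inference.latent_function_inference.EP())
    # Both approximate the same log marginal likelihood p(y | X, lengthscale).
    print(ls, float(laplace.log_likelihood()), float(ep.log_likelihood()))
```

On an affected version, the EP column shows the large offset described above instead of tracking the Laplace column.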

I poked around in the code and found a TODO suggesting this might have a simple fix, but without fully understanding that code, it's hard for me to say for sure. https://github.com/SheffieldML/GPy/blob/master/GPy/inference/latent_function_inference/expectation_propagation.py#L55

Thanks, Scott

[figure: ep_log_lkhd — log marginal likelihood estimates under Laplace and EP for length scales 0.1 to 5.0]

mzwiessele commented 8 years ago

Hi @asaul, has this been fixed in 0.9? Could you try the devel branch for your experiment? I seem to remember someone fixing this recently.

lawrennd commented 8 years ago

Just tagging in @ric70x7 as this was also something he looked at. At one stage there was a term missing.
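
For reference on why a single missing term matters so much here (this is the standard textbook expression, not a diagnosis of the specific fix): with Gaussian site approximations $\tilde t_i(f_i) = \tilde Z_i\,\mathcal{N}(f_i; \tilde\mu_i, \tilde\sigma_i^2)$, cavity parameters $\mu_{-i}, \sigma_{-i}^2$, and labels $y_i \in \{-1, +1\}$, the EP approximation to the log marginal likelihood for the probit likelihood is

$$
\log Z_{\mathrm{EP}} = -\tfrac{1}{2}\log\bigl|K + \tilde\Sigma\bigr|
- \tfrac{1}{2}\tilde\mu^{\top}\bigl(K + \tilde\Sigma\bigr)^{-1}\tilde\mu
+ \sum_i \log\Phi\!\left(\frac{y_i\,\mu_{-i}}{\sqrt{1 + \sigma_{-i}^2}}\right)
+ \tfrac{1}{2}\sum_i \log\bigl(\sigma_{-i}^2 + \tilde\sigma_i^2\bigr)
+ \sum_i \frac{(\mu_{-i} - \tilde\mu_i)^2}{2\,(\sigma_{-i}^2 + \tilde\sigma_i^2)},
$$

where $\tilde\Sigma = \operatorname{diag}(\tilde\sigma_1^2, \dots, \tilde\sigma_n^2)$ (cf. Rasmussen & Williams, 2006, §3.6). The last two sums come from the site normalizers $\tilde Z_i$; dropping them shifts the estimate by an amount that grows with the number of data points and varies with the length scale, which is the kind of large offset seen in the plot above.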

slinderman commented 8 years ago

Sorry, I should have checked the devel branch first! Sure enough, the issue has been fixed there. I've attached the updated plot for completeness. Thanks for your help! Scott
[figure: ep_log_lkhd — updated log marginal likelihood plot from the devel branch]

lawrennd commented 8 years ago

No problem Scott. Would you consider integrating your AIS code? That would be a great addition!

Neil

slinderman commented 8 years ago

Certainly, as soon as I work out a few remaining kinks! I'm using an augmentation-based sampling scheme that might also be of interest. Hopefully I'll be able to finish this for the ICML deadline. I'll keep you posted!
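
For anyone who wants to compute reference values in the meantime, here is a minimal sketch of AIS for the GP-classification log marginal likelihood. It is not Scott's code: the function names, the Bernoulli/probit likelihood, and the choice of elliptical slice sampling as the transition kernel are all assumptions made for the example.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

def log_lik(f, y):
    # Bernoulli likelihood with a probit link; labels y are in {-1, +1}.
    return norm.logcdf(y * f).sum()

def ess_step(f, L, loglik_fn):
    # One elliptical slice sampling update (Murray, Adams & MacKay, 2010);
    # leaves N(0, L L^T) * exp(loglik_fn) invariant.
    nu = L @ np.random.randn(f.size)
    log_threshold = loglik_fn(f) + np.log(np.random.rand())
    theta = np.random.uniform(0., 2. * np.pi)
    theta_min, theta_max = theta - 2. * np.pi, theta
    while True:
        f_prop = f * np.cos(theta) + nu * np.sin(theta)
        if loglik_fn(f_prop) > log_threshold:
            return f_prop
        if theta < 0.:
            theta_min = theta
        else:
            theta_max = theta
        theta = np.random.uniform(theta_min, theta_max)

def ais_log_marginal(K, y, n_temps=500, n_runs=20):
    # Anneal from the GP prior (beta = 0) to the posterior (beta = 1) along
    # pi_beta(f) = N(f; 0, K) * p(y | f)^beta, accumulating AIS log-weights.
    n = K.shape[0]
    L = np.linalg.cholesky(K + 1e-8 * np.eye(n))
    betas = np.linspace(0., 1., n_temps + 1)
    log_w = np.zeros(n_runs)
    for r in range(n_runs):
        f = L @ np.random.randn(n)                      # exact draw from the prior
        for b_prev, b in zip(betas[:-1], betas[1:]):
            log_w[r] += (b - b_prev) * log_lik(f, y)    # importance weight update
            f = ess_step(f, L, lambda f_: b * log_lik(f_, y))
    return logsumexp(log_w) - np.log(n_runs)            # log of the mean weight
```

Elliptical slice sampling is convenient here because every annealed target shares the same Gaussian prior over f, so the transition kernel needs no tuning.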

lawrennd commented 8 years ago

Excellent, thanks Scott!
