SheffieldML / GPy

Gaussian processes framework in python
BSD 3-Clause "New" or "Revised" License
2.03k stars 561 forks

How can I use the HMC method to approximate a non-Gaussian likelihood? Thank you #554

Open lk1983823 opened 7 years ago

lk1983823 commented 7 years ago

I set the Poisson distribution pdf as my GP likelihood and want to use the HMC method to infer its parameters. Here is my code:

poisson_likelihood = GPy.likelihoods.Poisson()
kernel = GPy.kern.RBF(input_dim, variance=1.0, lengthscale=1.0)
hmc_inf= GPy.inference.mcmc.HMC()
m = GPy.core.GP(X=train_X_scaled, Y=train_Y_dl_total_scaled, likelihood=poisson_likelihood, inference_method=hmc_inf, kernel= kernel)

the error shows:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-97-6f996d532af2> in <module>()
      3 poisson_likelihood = GPy.likelihoods.Poisson()
      4 kernel = GPy.kern.RBF(input_dim, variance=1.0, lengthscale=1.0)
----> 5 hmc_inf= GPy.inference.mcmc.HMC()
      6 m = GPy.core.GP(X=train_X_scaled, Y=train_Y_dl_total_scaled, likelihood=poisson_likelihood, inference_method=hmc_inf, kernel= kernel)
      7 

TypeError: __init__() missing 1 required positional argument: 'model'

So I want to know: is there any way to use HMC to infer the parameters of an arbitrary likelihood? Thank you for your help!

mu2013 commented 7 years ago

I think there is an example notebook about HMC in GPy. Zhenwen should know more about it.


lk1983823 commented 7 years ago

@mu2013: Do you mean this: http://nbviewer.jupyter.org/github/SheffieldML/notebook/blob/master/GPy/sampling_hmc.ipynb? I know that tutorial, but I find it impossible to set the likelihood arbitrarily there.

zhenwendai commented 7 years ago

How would you like to handle the intractable integral of the non-Gaussian likelihood?

You can run HMC for a non-Gaussian likelihood GP with Laplace approximation, but the samples are biased because of the approximation. (This can be done with GPy.)

Alternatively, you can run HMC without marginalizing out the output variable of the GP prior, f, but this results in a very high-dimensional HMC sampling problem. (This is not implemented.)

lk1983823 commented 7 years ago

@zhenwendai Thank you for your reply. I am new to GPs and can't follow everything you said above, for example "run HMC for a non-Gaussian likelihood GP with Laplace approximation". The reason I ask is that GPflow's GPMC makes MCMC inference possible ("http://gpflow.readthedocs.io/en/latest/notebooks/mcmc.html"), where I can set the likelihood arbitrarily. I just wonder why I can't do this with GPy.

zhenwendai commented 7 years ago

According to the link you provided, GPflow does the second thing I mentioned previously: its HMC sampler draws samples for f and the model parameters jointly, which is typically very high dimensional.

Unfortunately, GPy does not support this at the moment, because by default we focus on models with the latent function f marginalized out (approximately).

mu2013 commented 7 years ago

HMC requires the gradient of the likelihood, but MCMC with Metropolis-Hastings does not. For the non-Gaussian likelihood case, what we can do is similar to Williams, C. K. and D. Barber (1998), "Bayesian classification with Gaussian processes".

You can check out that paper for the details.
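To illustrate the gradient-free point, here is a minimal random-walk Metropolis-Hastings sketch in plain numpy. The standard-normal target and the step size are illustrative stand-ins for a GP hyperparameter posterior; the only thing the sampler ever needs is log-density evaluations:

```python
import numpy as np

def log_target(x):
    # Unnormalized log-density; only point evaluations needed, no gradients
    return -0.5 * x ** 2  # standard normal target (illustrative)

rng = np.random.default_rng(0)
x, accepted, samples = 0.0, 0, []
for _ in range(20000):
    proposal = x + rng.normal(scale=1.0)  # symmetric random-walk proposal
    # Accept with probability min(1, p(proposal) / p(x))
    if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
        x, accepted = proposal, accepted + 1
    samples.append(x)

samples = np.asarray(samples)
rate = accepted / len(samples)
```

Swapping `log_target` for a GP model's log marginal likelihood plus a log prior gives a gradient-free sampler over hyperparameters, at the cost of slower mixing than HMC.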

cheers
