SheffieldML / GPy

Gaussian processes framework in python
BSD 3-Clause "New" or "Revised" License

how to train a multi-task GP under GPy framework #601

Closed shalijiang closed 6 years ago

shalijiang commented 6 years ago

With a single dataset (X, Y), we can optimize the model as follows:

    m = GPy.models.GPRegression(X, Y, kernel=kernel, mean_function=mean_function)
    m.optimize(messages=True, max_f_eval=1000)

What if we want to optimize the model by minimizing the sum of the negative log-likelihoods of multiple datasets (say, in the simplest case, independent multitask)? Is there a way to do that with GPy?

mzwiessele commented 6 years ago

Yes, just provide GPy with a multidimensional dataset Y \in R^(N, D), where D is the number of tasks and N is the number of samples/datapoints you have.
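To make the N x D layout concrete, here is a minimal NumPy-only sketch (not GPy code). It stacks three task outputs that share the same inputs X into one matrix Y, and checks that the Gaussian negative log marginal likelihood of the whole matrix under one shared kernel equals the sum over its columns. The helper names `rbf_kernel` and `multi_output_nll`, the RBF kernel, and the fixed hyperparameter values are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X1, X2, variance=1.0, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of X1 and X2."""
    sq_dists = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale ** 2)

def multi_output_nll(X, Y, variance=1.0, lengthscale=1.0, noise=0.1):
    """Negative log marginal likelihood of all D columns of Y under one shared kernel."""
    N, D = Y.shape
    K = rbf_kernel(X, X, variance, lengthscale) + noise * np.eye(N)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))  # K^{-1} Y, shape (N, D)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))           # log|K|
    return (0.5 * np.sum(Y * alpha)
            + 0.5 * D * log_det
            + 0.5 * N * D * np.log(2.0 * np.pi))

rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 20)[:, None]                   # N = 20 shared inputs

# Hypothetical outputs for D = 3 tasks, all observed at the same inputs X.
task_outputs = [
    np.sin(6.0 * X[:, 0]) + 0.1 * rng.standard_normal(20),
    np.cos(6.0 * X[:, 0]) + 0.1 * rng.standard_normal(20),
    X[:, 0] ** 2 + 0.1 * rng.standard_normal(20),
]
Y = np.column_stack(task_outputs)                        # shape (N, D) = (20, 3)

joint = multi_output_nll(X, Y)
per_task = sum(multi_output_nll(X, Y[:, [d]]) for d in range(Y.shape[1]))
```

A matrix laid out like Y is what would be passed as the second argument of GPy.models.GPRegression(X, Y, ...). Note that this layout requires every task to be observed at the same inputs X.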

shalijiang commented 6 years ago

Thanks. Maybe I shouldn't have said multitask. What I actually want is to optimize the sum of the log-likelihoods of several independent datasets (X_1, Y_1), (X_2, Y_2), ..., (X_k, Y_k).

mzwiessele commented 6 years ago

Yes, you can just create a new model that has the GPs as parameters by linking them. The objective function will be the sum of the negative log-likelihoods (that is, the sum of the objective functions of the GPs).

See https://github.com/sods/paramz/blob/master/tutorial/ParamzSimpleRosen.ipynb or https://github.com/sods/paramz/blob/master/paramz/examples/ridge_regression.py for how to create a model.

You won’t need to implement the parameters_changed method, as the gradient of each GP is handled internally.

You can add as many GPs as you like and edit them according to your needs (kernels, likelihoods, inference etc.) as models are just parameters themselves.
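The objective described above, a sum of negative log-likelihoods over independent datasets, can be sketched without GPy or paramz. The block below is only an illustration under stated assumptions: `gp_nll` and `summed_nll` are hypothetical helpers, the kernel is a plain RBF with a single lengthscale shared by all datasets, and a coarse grid search stands in for the gradient-based optimizer that a linked paramz model would use.

```python
import numpy as np

def gp_nll(X, y, lengthscale, variance=1.0, noise=0.1):
    """Negative log marginal likelihood of one dataset under an RBF-kernel GP."""
    n = X.shape[0]
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = variance * np.exp(-0.5 * sq_dists / lengthscale ** 2) + noise * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))           # log|K|
    return 0.5 * y @ alpha + 0.5 * log_det + 0.5 * n * np.log(2.0 * np.pi)

def summed_nll(datasets, lengthscale):
    """Objective: sum of the per-dataset negative log marginal likelihoods."""
    return sum(gp_nll(X, y, lengthscale) for X, y in datasets)

# Three hypothetical independent datasets with different inputs and sizes.
rng = np.random.default_rng(1)
datasets = []
for n in (15, 25, 10):
    X = rng.uniform(0.0, 5.0, size=(n, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)
    datasets.append((X, y))

# Coarse grid search over the shared lengthscale (a stand-in for an optimizer).
candidates = np.linspace(0.2, 3.0, 15)
objective = np.array([summed_nll(datasets, ls) for ls in candidates])
best_lengthscale = candidates[np.argmin(objective)]
print("best shared lengthscale on the grid:", best_lengthscale)
```

Unlike the N x D layout, this formulation lets each dataset have its own inputs and its own number of points. The datasets only interact through the shared hyperparameter.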
