GPflow / GPflow

Gaussian processes in TensorFlow
Apache License 2.0

Example that shows how to separate additive effects, e.g. time series decomposition of birthdays data #491

Closed cs224 closed 6 years ago

cs224 commented 7 years ago

Hello,

I am trying to replicate something like what is shown here: http://andrewgelman.com/2012/06/19/slick-time-series-decomposition-of-the-birthdays-data/ and here: http://research.cs.aalto.fi/pml/software/gpstuff/demo_births.shtml

I currently cannot work out how, after fitting a model, to use something like m.predict_y(xx) on only part of the model. It would help if your set of examples included something like this birthday demo.
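To make the question concrete, here is a minimal sketch of the set-up I have in mind (a GPflow 1.x-style API is assumed; the kernel choices and toy data are just placeholders):

```python
import gpflow
import numpy as np

# toy data standing in for the birthdays series
X = np.linspace(0, 10, 200)[:, None]
Y = np.sin(X) + 0.1 * X + 0.1 * np.random.randn(200, 1)
xx = np.linspace(0, 12, 300)[:, None]

# additive model: a slow trend plus a periodic effect
k_trend = gpflow.kernels.RBF(1)
k_periodic = gpflow.kernels.Periodic(1)
m = gpflow.models.GPR(X, Y, kern=k_trend + k_periodic)
gpflow.train.ScipyOptimizer().minimize(m)

mean, var = m.predict_y(xx)  # prediction from the full model works fine
# ...but how do I get the contribution of, say, only the periodic component?
```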

Many thanks and best regards, Christian

mathDR commented 7 years ago

This was discussed in Issue #232.

I went ahead and implemented a solution. It fits the full model, then makes separate models for each component and sets the hyperparameters correctly. See the notebook here. Note that you will need to download the birthday dataset here.
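In outline, the approach looks something like this (a hedged sketch, GPflow 1.x-style API assumed; which parameters to copy depends on the kernels you use, and X, Y, xx are the training and test arrays):

```python
import gpflow

# 1) fit the full additive model
k1 = gpflow.kernels.RBF(1)
k2 = gpflow.kernels.Periodic(1)
m = gpflow.models.GPR(X, Y, kern=k1 + k2)
gpflow.train.ScipyOptimizer().minimize(m)

# 2) build a single-component model and copy the trained
#    hyperparameters (and the full model's noise variance) into it
k_trend = gpflow.kernels.RBF(1)
m_trend = gpflow.models.GPR(X, Y, kern=k_trend)
m_trend.kern.variance = k1.variance.read_value()
m_trend.kern.lengthscales = k1.lengthscales.read_value()
m_trend.likelihood.variance = m.likelihood.variance.read_value()

# 3) predict using only this component's model
mean_trend, var_trend = m_trend.predict_y(xx)
```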

cs224 commented 7 years ago

Thank you very much for this very useful solution! I'll have to look into it in more detail.

At first sight I have two questions:

1) If I used a more complicated additive kernel, it would help if I could refer to each component by name or, as in the GPstuff version, by index. I understand that the conversation in #234 was exactly about this point, i.e. how to present the API to an end-user like me, so I will continue my questions there.

2) If I look at the graph generated by 'plot(m1, X, Y, N=1)', I cannot believe it is correct: the gap between the mean line and the dots is too big to be closed by the small contribution of the periodic component. The whole model, on the other hand, fits very well ('plot(m, X, Y, N=1)')!

I think this ticket #491 is still relevant, and it would help others if the documentation included an example showing how to separate additive effects.

Thanks a lot! Christian

cs224 commented 7 years ago

I've now applied the insights I got from @mathDR's example to a simpler dataset: the Mauna Loa CO2 data, as used in chapter 5 of Gaussian Processes for Machine Learning.

I put my initial results in a Jupyter notebook here. Use the nbviewer link for a better visual display.

Either I am doing something wrong, or I have a major misunderstanding of what the components are supposed to mean. I was under the impression that the components should be additive, but my test case does not look additive. I've described my confusion in more detail in the notebook.

I would be very grateful for any further hints.

I am willing to develop the CO2 and the birthday examples into full examples that could be included in the GPflow documentation, if wanted.

cs224 commented 7 years ago

Please correct me if I am wrong, but as far as I can tell, predicting the components of an additive kernel does not work by looking at each individual kernel in isolation.

Have a look at The Kernel Cookbook and scroll down to "Additive decomposition", or look at equation 1.17 in the kernels chapter of David Duvenaud's PhD thesis.
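If I read equation 1.17 correctly, the result is this (my summary, so please double-check): with independent priors f1 ~ GP(0, k1) and f2 ~ GP(0, k2), noise variance σ_n², and noisy observations y of f1 + f2, the posterior of the first component at test points X* is Gaussian with

mean = K1(X*, X) [K1(X, X) + K2(X, X) + σ_n² I]^{-1} y
cov  = K1(X*, X*) - K1(X*, X) [K1(X, X) + K2(X, X) + σ_n² I]^{-1} K1(X, X*)

The matrix that gets inverted is the Gram matrix of the full sum kernel plus noise, not the component's own Gram matrix, which is why fitting a separate single-kernel model does not give the right decomposition.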

I believe this functionality would need to be part of GPflow's core feature set, wouldn't it?

cs224 commented 7 years ago

Here is a blog post on some new developments in PyMC3 that shows how to perform this additive decomposition using the new features: https://bwengals.github.io/looking-at-the-keeling-curve-with-gps-in-pymc3.html It would be great to have a similar blog post/example showing how to achieve this with GPflow.
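For reference, the PyMC3 pattern from that post looks roughly like this (a sketch from memory; priors and names are illustrative, and X, y, Xnew are numpy arrays with the training and test data):

```python
import pymc3 as pm

with pm.Model() as model:
    gp_trend = pm.gp.Marginal(cov_func=pm.gp.cov.ExpQuad(1, ls=50.0))
    gp_seasonal = pm.gp.Marginal(cov_func=pm.gp.cov.Periodic(1, period=1.0, ls=1.0))
    gp = gp_trend + gp_seasonal          # additive GPs compose directly
    sigma = pm.HalfNormal("sigma", sd=1.0)
    y_ = gp.marginal_likelihood("y", X=X, y=y, noise=sigma)
    mp = pm.find_MAP()

    # the decomposition step: the conditional of ONE component given the sum
    f_trend = gp_trend.conditional("f_trend", Xnew,
                                   given={"X": X, "y": y, "noise": sigma, "gp": gp})
```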

cs224 commented 7 years ago

The simplistic idea from above of how to do the additive decomposition is not correct.

I did not find a way to make GPflow perform this additive decomposition, so I decided to implement it by hand. In that set-up, GPflow is used to find the correct hyperparameters, and the hand-written code decomposes the additive parts.

I've put the example code in a Jupyter notebook here. Use the nbviewer link for a better visual display.
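In essence, the hand-written part computes the conditional from the equation above; a minimal numpy sketch (variable names are mine, not from the notebook):

```python
import numpy as np

def predict_component(K1_star, K1_star_star, K_full_noisy, y):
    """Posterior mean and covariance of one additive component f1.

    K1_star      : K1(X*, X), component kernel between test and training points
    K1_star_star : K1(X*, X*), component kernel at the test points
    K_full_noisy : K1(X, X) + K2(X, X) + sigma_n^2 * I, full Gram matrix plus noise
    y            : training targets
    """
    mean = K1_star.dot(np.linalg.solve(K_full_noisy, y))
    cov = K1_star_star - K1_star.dot(np.linalg.solve(K_full_noisy, K1_star.T))
    return mean, cov
```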

mathDR commented 7 years ago

@cs224 Thanks for posting this! (I was in the midst of writing up the same notebook!!)

yunlinz commented 6 years ago

@cs224 This notebook is very helpful.

One thing I've found while playing with GPflow is that you can access the kernel functions directly, so you can replace your kernel-matrix calculations with the built-in Kernel.compute_K and Kernel.compute_K_symm functions, which use the trained parameters.

Another thing of note: the noise you observe in the posteriors should not be real; it is probably the result of ill-conditioning when inverting a matrix. If you replace np.linalg.inv(A).dot(y) with np.linalg.solve(A, y), you should get better-conditioned results.
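Putting both suggestions together, the matrices can come straight from the trained model, something like this (GPflow 1.x-style API assumed; m is the fitted model, k1 one component of its additive kernel, and X, Y, xx the training and test arrays):

```python
import numpy as np

sigma2 = m.likelihood.variance.read_value()
K = m.kern.compute_K_symm(X) + sigma2 * np.eye(X.shape[0])  # K1 + K2 + noise
K1_star = k1.compute_K(xx, X)

# instead of: mu1 = K1_star.dot(np.linalg.inv(K)).dot(Y)
mu1 = K1_star.dot(np.linalg.solve(K, Y))  # better conditioned
```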

cs224 commented 6 years ago

@yunlinz Thanks for your comments. I'll try that.