Open Hephaistos96 opened 4 years ago
@Hephaistos96 I'm like 99% confident this is just a mismatch between optimizer hyperparameter settings. Looking at the ones you use, a batch size of 25 is pretty small, so that might be the culprit. Maybe you can find the hyperparameters that are hidden away by the .optimize call?
In my experience, 150 samples is pretty crazily high (e.g., in practice you probably don't need your predictive distributions to be represented by a mixture of 150 Gaussians). On real data it's probably better to swap that out for a few more inducing points.
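For reference, here is a minimal sketch of where those hyperparameters typically live in a GPyTorch DeepGP training loop, following the structure of the official DeepGP tutorial. The toy data, the values for `batch_size`, `lr`, `num_samples`, and `model` itself are illustrative placeholders, not the poster's actual settings (a model definition sketch appears under the "GPyTorch Deep GPs" heading at the end of the post):

```python
import torch
import gpytorch
from torch.utils.data import TensorDataset, DataLoader
from gpytorch.mlls import DeepApproximateMLL, VariationalELBO

# Toy step-like data on [-2, 2], standing in for the poster's actual dataset.
train_x = torch.linspace(-2, 2, 200).unsqueeze(-1)
train_y = torch.sign(torch.sin(3 * train_x)).squeeze(-1) + 0.05 * torch.randn(200)

# `model` is assumed to be a DeepGP with a `.likelihood` attribute,
# e.g. the TwoLayerDeepGP sketched later in this post.
train_loader = DataLoader(
    TensorDataset(train_x, train_y),
    batch_size=64,   # a batch of 25 is quite small; larger batches give smoother ELBO gradients
    shuffle=True,
)

optimizer = torch.optim.Adam(model.parameters(), lr=0.01)  # the learning rate is one of the "hidden" hypers
mll = DeepApproximateMLL(VariationalELBO(model.likelihood, model, train_x.shape[0]))

num_epochs = 500
num_samples = 10  # a handful of likelihood samples is usually enough; 150 is likely overkill

model.train()
for epoch in range(num_epochs):
    for x_batch, y_batch in train_loader:
        with gpytorch.settings.num_likelihood_samples(num_samples):
            optimizer.zero_grad()
            output = model(x_batch)
            loss = -mll(output, y_batch)
            loss.backward()
            optimizer.step()
```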
Good morning,
Context & Issue
I am currently in the process of discovering DeepGPs as part of a school project.
I have already tried a library created by the Sheffield ML group, deepgp, which generates pretty good and consistent results but whose computation time gets particularly long even with training sets containing only a few thousand points (code scalability is a major constraint of my project).
As a consequence, I have decided to explore new approaches, in particular GPyTorch Deep GPs. I have followed the implementation tutorial available on the documentation website and indeed got a significant improvement in time performance. However, I haven't been able to get the same prediction accuracy as I had with the other library.
Here is an illustration of my issue (full code at the end of the post): I designed a function containing multiple "steps", randomly picked a few points in the interval [-2; 2], and ran both deepgp and GPyTorch Deep GPs to generate a mean prediction on the same interval. As you can see, GPyTorch produces a result that is a lot "noisier":
[Plots omitted: GPyTorch prediction vs. deepgp prediction]
I am not sure what is causing that issue. Since GPyTorch Deep GPs let you control a lot of different parameters (deep GP layer structure, number of inducing points, number of samples, batch size, etc.) that deepgp manages on its own, I thought that my GPyTorch Deep GP might be parameterized incorrectly. I have tried a lot of different parameter combinations, but nothing really worked.
I would therefore appreciate your opinion on the potential source of my issue.
Question
Am I right in thinking that my model parameters aren't set properly, or do you think my issue has another explanation? If it is the former, do you know where I could find documentation about the influence of each parameter and how to choose them as wisely as possible?
Additional notes, questions
My project actually consists of 3D regressions on training sets that look like this:
I have tried to run GPyTorch Deep GPs on this kind of example and the result is a lot worse than what I get in the 2D case: I get a constant prediction over the whole prediction surface. On simpler 3D functions (x² + y², for instance) and with a training set consisting of regularly spaced points, I do get a good result, though.
Code (2D example)
deepgp module
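(The original script is not reproduced here. As a rough stand-in, a minimal deepgp setup along the lines of the library's published examples might look like the sketch below; `X`, `Y`, `X_test`, the layer sizes, and the kernels are illustrative placeholders, not the poster's actual code.)

```python
# Sketch only, based on the Sheffield ML deepgp examples -- not the original script.
# X, Y are the training inputs/targets; X_test are the prediction locations.
import GPy
import deepgp

hidden_dim = 5
model = deepgp.DeepGP(
    [Y.shape[1], hidden_dim, X.shape[1]],          # output dim, hidden dim, input dim
    Y, X=X,
    kernels=[GPy.kern.RBF(hidden_dim, ARD=True),   # kernel for the output layer
             GPy.kern.RBF(X.shape[1], ARD=True)],  # kernel for the hidden layer
    num_inducing=50,
    back_constraint=False,
)

# .optimize() is where the optimizer hyperparameters mentioned above are hidden
model.optimize(messages=True, max_iters=2000)
mean, var = model.predict(X_test)
```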
GPyTorch Deep GPs
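(Again, the original script is not reproduced here. The sketch below follows the structure of the official GPyTorch DeepGP tutorial; the class names, layer widths, and inducing-point counts are illustrative, not the poster's actual configuration.)

```python
# Sketch of a two-layer Deep GP in the style of the GPyTorch tutorial -- not the original script.
import torch
from gpytorch.models.deep_gps import DeepGP, DeepGPLayer
from gpytorch.means import ConstantMean, LinearMean
from gpytorch.kernels import ScaleKernel, RBFKernel
from gpytorch.variational import CholeskyVariationalDistribution, VariationalStrategy
from gpytorch.distributions import MultivariateNormal
from gpytorch.likelihoods import GaussianLikelihood


class HiddenLayer(DeepGPLayer):
    """One GP layer with its own inducing points and variational distribution."""

    def __init__(self, input_dims, output_dims, num_inducing=64, mean_type="linear"):
        if output_dims is None:
            inducing_points = torch.randn(num_inducing, input_dims)
            batch_shape = torch.Size([])
        else:
            inducing_points = torch.randn(output_dims, num_inducing, input_dims)
            batch_shape = torch.Size([output_dims])

        variational_distribution = CholeskyVariationalDistribution(
            num_inducing_points=num_inducing, batch_shape=batch_shape
        )
        variational_strategy = VariationalStrategy(
            self, inducing_points, variational_distribution, learn_inducing_locations=True
        )
        super().__init__(variational_strategy, input_dims, output_dims)

        self.mean_module = (
            LinearMean(input_dims) if mean_type == "linear" else ConstantMean(batch_shape=batch_shape)
        )
        self.covar_module = ScaleKernel(
            RBFKernel(batch_shape=batch_shape, ard_num_dims=input_dims),
            batch_shape=batch_shape,
        )

    def forward(self, x):
        return MultivariateNormal(self.mean_module(x), self.covar_module(x))


class TwoLayerDeepGP(DeepGP):
    """A two-layer Deep GP: input -> hidden GP layer -> output GP layer."""

    def __init__(self, input_dims, hidden_dims=2):
        super().__init__()
        self.hidden_layer = HiddenLayer(input_dims, hidden_dims, mean_type="linear")
        self.output_layer = HiddenLayer(hidden_dims, None, mean_type="constant")
        self.likelihood = GaussianLikelihood()

    def forward(self, x):
        return self.output_layer(self.hidden_layer(x))
```

A model like this can be trained with the loop sketched earlier in the thread; predictions are then drawn under `gpytorch.settings.num_likelihood_samples(...)` and averaged across samples, as in the tutorial.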