Updates for Pyro 1.0 release

fehiepsi commented 4 years ago

@tbrx pointed me to this nice usage of Pyro GP module! There would be some minor changes in Pyro 1.0 (which will be released soon), so I would like to point out these differences before it is finalized. I also want to take this chance to clarify some questions in this gist (I put all the comments in the Note sections). If you have any feedback, please let me know so I can incorporate them in time. :)

tbrx commented 4 years ago

@fehiepsi Thank you so much for taking the time to go through this and demonstrate the new usage for Pyro 1.0 and the use of PyroModule and PyroSample. I suppose we are now too late to give feedback for 1.0, but I do have some follow-up questions about the interface regardless! I think these stem from not being entirely clear about what is going on under the hood for gp.Parameterized, relative to PyroModule, and how the guide "mode" works.

Is there a straightforward way to define explicit guides for the GP parameters, even if these are autoguides? For example, it would be great to be able to write something like guide = pyro.infer.autoguide.AutoMultivariateNormal(gpmodel), but then I don't know how to do inference. Is it possible to train a GP using the standard svi = SVI(model, guide, optimizer, loss) syntax, or something similar?
Is it possible for the mean_function argument to GPRegression to be a PyroModule, or does this have to be an instance of gp.Parameterized? It seems like perhaps the latter, in order to have hooks into .set_mode('guide') and .set_mode('model'); otherwise I am not quite sure how to incorporate the guides for the mean function alongside the guides for the GP. I see how your implementation (with a class ParametricFn(gp.Parameterized):…) works, but it seems like then again we give up the ability to use autoguides. Explicitly defining PyroParam objects for each parameter is definitely easier to use than what we had before, but it still seems silly to do so when effectively it is just defining an AutoDiagonalNormal guide anyway.

I suppose the short version of this is: are there any best-practices we should be aware of when defining models which use a GP as part of a larger model, rather than just using a GP alone? In particular it seems tricky to combine the different guides.

Thanks again for your help!

fehiepsi commented 4 years ago

Hi @tbrx, thanks for the feedback and questions! I think you can use gpmodels with Pyro autoguide and SVI as any PyroModule instance: autoguide.AutoMultivariateNormal(gpmodel.model). Parameterized just does an additional job to define autoguide for each random parameter. And there is no difference between utilities gp.util.train and SVI except that gp.util.train requires model/guide to be methods (instead of functions) of a class and it uses PyTorch optimizer instead of Pyro optimizer. So if you don't want that additional feature, you can just use gpmodel.model.

Re mean function: you can define the mean function to be a function, an nn.Module (e.g. the CNN class in the deep kernel learning example or nn.Linear in the deep gp example), or a Parameterized (if you need the autoguide mechanism of Parameterized, which is both limited and flexible, e.g. in gplvm I want to use autonormal only for latent X and the default autodelta for other random parameters). You can also define it to be a PyroModule with some random parameters, but you would need to write a separate guide for it.

I guess your main question is how to avoid using gpmodel.guide. Sure you can (it would be a bug if you couldn't, so please let me know if there are any issues). I believe that you can use gpmodel.model in a larger model and define your own guide if that is more flexible. Of course, you would need to write your own predict method to do prediction on new data using latent samples from your custom guide. Though, to be honest, I don't know what the limitation of using gpmodel.guide is. The ParametricModule class in the gist just shows you how to take advantage of mode to incorporate a hierarchical guide for the mean function's random parameters (if you want to define a hierarchical prior/guide for a random parameter in PyroModule, you would have to use the same lambda self: pattern). Without that, you would need to write a separate guide for the mean function, use SVI to train a pyro.infer.autoguide or custom guide, get samples from the guide, and plug them into some predict function... I also don't know how to combine pyro.infer.autoguide and a custom guide for the mean function. :( Maybe I am missing something?

it still seems silly to do so when effectively it is just defining a AutoDiagonalNormal guide anyway

The original notebook constructs a hierarchical guide for the mean function's parameters (to set some constraints for the loc and scale params), so I just wanted to illustrate how to achieve that. If you want to use a Diagonal Normal autoguide directly, simply use the following (more explanation can be found in the note I put before cell 16):

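import pyro
import pyro.contrib.gp as gp
import pyro.distributions as dist

class ParametricFn(gp.Parameterized):
    def __init__(self):
        super().__init__()
        # Priors on the mean function's random parameters.
        self.alpha = pyro.nn.PyroSample(dist.Uniform(0, 20))
        self.beta = ...
        self.gamma = ...
        # Use a Normal guide for each parameter instead of the default Delta.
        self.autoguide("alpha", dist.Normal)
        self.autoguide("beta", dist.Normal)
        self.autoguide("gamma", dist.Normal)

    def forward(self, X):
        # parametric_fn is assumed to be defined elsewhere (e.g. in the notebook).
        return parametric_fn(X, self.alpha, self.beta, self.gamma)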

tbrx commented 4 years ago

Thanks — these comments were actually quite helpful. I'm not sure what my issues were before, but I'm now able to use e.g. autoguide.AutoMultivariateNormal(gpmodel.model) successfully, even if using SVI(…) for inference.

I'll try to pull a trimmed-down version of that semiparametric model example together into a gist, and if I run into any further issues with the autoguides or with composing the GP with other models I'll let you know…

fehiepsi commented 4 years ago

Yeah, that is great to hear! Just let me know if you want to discuss further when issues come up. ;)

tbrx commented 4 years ago

Okay, so although autoguides seem to work as they should at training time (using SVI), I see my problem before was maybe actually related to what you refer to here:

you would need to write a separate guide for mean function, use SVI to train pyro.infer.autoguide or custom guide, get samples from the guide, and replace it into some predict function...

Yesterday I had been mistaking my model as not training, when actually it was sampling from the prior instead of the guide for the mean function. It would be nice if there were some way to do, e.g.,

predictive = pyro.infer.Predictive(gpmodel, guide=guide, num_samples=1, return_sites=('_RETURN',))

At the moment, regarding guides, I was hoping to not have to think about guides too much at this stage and focus on just writing model code and using some auto-Gaussian variant. In that regard, the following gist basically does what I'm looking for:

https://gist.github.com/tbrx/a12e244a4424d94fbba744cfbd188e9c

Prediction can actually be handled easily with pyro.infer.Predictive, as well.

It does mean that instead of using the mean_function= argument for the GP, I am explicitly calling .set_data to update the residual on each .model call. That seems a bit of a hack, but will be okay for us for now.

fehiepsi commented 4 years ago

Yeah, the gist shows what we should do with autoguide and SVI. It is nice to see that things work as expected. :D

predictive = pyro.infer.Predictive(gpmodel, guide=guide, num_samples=1, return_sites=('_RETURN',))

Does gpmodel mean semiparametric or semiparametric.gpmodel? AFAIK, it is best not to use _RETURN if multiple values are returned during prediction. It is better to write sample statements such as pyro.sample("z", dist.Delta(z), obs=z) to record the values that you want. We keep _RETURN for backward compatibility. In the next version of Pyro, you can use pyro.deterministic for this purpose.

using the mean_function= argument for the GP

I believe that simply defining mean_function=parametric_mean also works? This way you don't need to invoke set_data or wrap gp.model in a model method. But I might have overlooked something...

autoguide.AutoMultivariateNormal

I believe this is a limitation of the GP module: the autoguide method of Parameterized only defines a guide separately for each parameter, so it cannot give a joint multivariate normal guide over all of them. The workaround is to use pyro.infer.autoguide.AutoMultivariateNormal as you did, but this only works for GP regression models (GPR and SGPR). :)

not have to think about guides too much

Agreed!!! I rarely write a guide in Pyro. :D

alan-turing-institute / SBO_Pyro

Updates for Pyro 1.0 release #5