ziatdinovmax / gpax

Gaussian Processes for Experimental Sciences
http://gpax.rtfd.io
MIT License

ExactGP.predict and viGP.predict produce inconsistent shapes #49

Open matthewcarbone opened 1 year ago

matthewcarbone commented 1 year ago

The predict methods on ExactGP and viGP produce results of different shapes.

ExactGP.predict produces a 3-tensor, e.g. (2000, 200, 100).

viGP.predict produces a 1-tensor, e.g. (100,).
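
For reference, a minimal sketch that reproduces the mismatch (call signatures are assumed from the gpax docs at the time and may differ by version; the printed shapes are the ones quoted above):

```python
import jax.numpy as jnp
import gpax

key_fit, key_pred = gpax.utils.get_keys()
X = jnp.linspace(0, 1, 20)[:, None]
y = jnp.sin(10 * X).squeeze()
X_new = jnp.linspace(0, 1, 100)[:, None]

exact_gp = gpax.ExactGP(1, "RBF")
exact_gp.fit(key_fit, X, y)                    # HMC over kernel hyperparameters
_, y_sampled = exact_gp.predict(key_pred, X_new)
print(y_sampled.shape)                         # 3-tensor, e.g. (2000, 200, 100)

vi_gp = gpax.viGP(1, "RBF")
vi_gp.fit(key_fit, X, y)                       # SVI point estimate
_, y_var = vi_gp.predict(key_pred, X_new)
print(y_var.shape)                             # 1-tensor, (100,)
```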

Is there any way to standardize the output of these methods? Also, it appears that 2000 is the number of HMC samples after warmup and 200 is the number of draws from the posterior. Maybe the output of viGP.predict should be (1, 200, 100) to make it consistent (since there's only a single value for, e.g., samples["k_length"]). This should be easy enough to do by just using mean, cov = self.get_mvn_posterior(X_new, samples, noiseless, **kwargs) to draw 200 samples, I think. Let me know if I have this totally wrong.
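
For concreteness, a hedged sketch of what this could look like inside viGP: only the get_mvn_posterior call is quoted from the codebase; the method signature and get_samples are assumptions.

```python
import numpyro.distributions as dist

def predict(self, rng_key, X_new, noiseless=False, n_draws=200, **kwargs):
    # Single point estimate of the kernel hyperparameters (SVI)
    samples = self.get_samples()
    mean, cov = self.get_mvn_posterior(X_new, samples, noiseless, **kwargs)
    y_sampled = dist.MultivariateNormal(mean, covariance_matrix=cov).sample(
        rng_key, sample_shape=(n_draws,))
    # Add a leading singleton dim for consistency with ExactGP: (1, 200, 100)
    return mean, y_sampled[None]
```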

In addition, this raises the question of whether an ABC for GPs should be used. It would probably be best for the user if, wherever possible, every core method of each GP produced the same type of object. I see that viGP inherits from ExactGP, but it might be best to have ExactGP (or whatever the most basic class is) inherit from some ABC.
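
Something like the following illustrative sketch (the class and method names here are hypothetical, not the current gpax layout):

```python
from abc import ABC, abstractmethod

class GPBase(ABC):
    """Common interface every GP in the library would satisfy."""

    @abstractmethod
    def fit(self, rng_key, X, y, **kwargs):
        ...

    @abstractmethod
    def predict(self, rng_key, X_new, **kwargs):
        """Return predictions with one shape contract shared by all subclasses."""

class ExactGP(GPBase):   # HMC-based inference
    ...

class viGP(ExactGP):     # variational inference, same predict contract
    ...
```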

matthewcarbone commented 1 year ago

Happy to attempt a fix if you'd like.

ziatdinovmax commented 1 year ago

In the beginning, both ExactGP and viGP output identical shapes. However, while y_sampled in ExactGP makes sense, since each sample corresponds to a different HMC draw of the kernel hyperparameters, y_sampled in viGP is drawn from the same single point estimate of the kernel hyperparameters, which makes it pretty noisy. Hence, for practical purposes, viGP returns the predictive variance values directly.

That said, there is certainly an "asymmetry" between fully Bayesian and variational inference tools currently available. I need to think a bit more about how to address it from the design point of view.

matthewcarbone commented 1 year ago

@ziatdinovmax I see what you're saying, but ultimately doesn't it make more sense from a design standpoint to have a common interface? It's going to make building tools on top of what you already have really difficult if each GP's predict method returns something different.

IMO, making this really clear in the docs while implementing the consistent method is the way to go. That way, a GP that inherits from some base ABC has fully known behavior when calling predict (which will naturally make, e.g., building the campaigning library much easier).

ziatdinovmax commented 1 year ago

@matthewcarbone - adding it to the v0.3 'milestone' per our discussion

matthewcarbone commented 1 year ago

@ziatdinovmax yup, sounds good. To be clear, this would technically be a backwards-incompatible change, since the shape returned by the predict method would change. Should we make such a change, or should we add a new method that returns a consistent shape across the GP classes?

ziatdinovmax commented 1 year ago

I guess the predict method should just always output the predictive mean and variance. Then we can have another method, specific to the HMC-based GPs, that returns samples from the HMC posterior.

ziatdinovmax commented 1 year ago

In fact, we can have a .sample_from_posterior() method for all GPs that outputs samples of shape (M, N, len(X_new)). For the HMC-based GPs, M is equal to the number of HMC posterior samples, and for SVI-based GPs, M is equal to 1.
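
A hedged sketch of what that could look like (only get_mvn_posterior is quoted from the discussion above; get_samples and the vmap plumbing are assumptions):

```python
import jax
import numpyro.distributions as dist

def sample_from_posterior(self, rng_key, X_new, n_draws=200,
                          noiseless=False, **kwargs):
    # Hyperparameter samples as a dict of arrays with leading dim M
    # (M = number of HMC samples; an SVI point estimate would first be
    # given a leading dim of 1).
    samples = self.get_samples()
    M = len(next(iter(samples.values())))
    keys = jax.random.split(rng_key, M)

    def draw(key, single_sample):
        # One MVN posterior per hyperparameter sample, n_draws draws from each
        mean, cov = self.get_mvn_posterior(X_new, single_sample,
                                           noiseless, **kwargs)
        return dist.MultivariateNormal(mean, covariance_matrix=cov).sample(
            key, sample_shape=(n_draws,))

    # vmap over the M hyperparameter samples -> (M, n_draws, len(X_new))
    return jax.vmap(draw)(keys, samples)
```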

matthewcarbone commented 1 year ago

Perfect, I really like the idea of sample_from_posterior().