cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch
MIT License

[Feature Request] Student T Processes #1858

Open johnryan465 opened 2 years ago

johnryan465 commented 2 years ago

🚀 Feature Request

Motivation

Student T Processes are the other member of the family of elliptical processes along with Gaussian Processes. In some situations these have preferable statistical properties.

Student-t Processes as Alternatives to Gaussian Processes

Pitch

Adding this to GPyTorch would enable users to work with a new family of processes, which would be useful for many people. I would like to invite suggestions on how to implement this in the code base before working on the pull request. For example, should all GPs be made a subclass of a new elliptical processes class, or should Student T Processes be written in a fashion that causes minimal changes to the rest of the GPyTorch code?

Are you willing to open a pull request?

Absolutely willing to work on this problem.

jacobrgardner commented 2 years ago

I personally think this would be pretty cool to support. My hunch is that there will be a ton of code reuse at the LazyTensor level and lower since they both just involve a lot of manipulation of positive definite matrices, but not much code reuse at the ExactGP/ApproximateGP or the PredictionStrategy level. This is probably good news overall, since the linear algebra is already there and reasonably well implemented via LazyTensor, and it's just a matter of building the model on top of that.

Basically I would think we'd have an analog of the GP class, and then of ExactGP and maybe ApproximateGP. I think that would be the best balance of still having significant code reuse in terms of the linear algebra operations (all models use LazyTensor to represent positive definite matrices), but is separated enough for the GP side of things that it'll be less likely to cause new problems there or become too hacky.
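To make that proposed layout concrete, a minimal skeleton might look like the following. Note that `TP` and `ExactTP` are placeholder names I'm using for illustration, not existing gpytorch classes:

```python
# Hypothetical skeleton of the proposed hierarchy; TP and ExactTP are
# placeholder names for illustration, not existing gpytorch classes.

class TP:
    """Analog of gpytorch's GP base class, for Student-t processes."""


class ExactTP(TP):
    """Analog of ExactGP: exact inference under a multivariate t prior.

    Would reuse the LazyTensor-based linear algebra for the kernel matrix,
    but carry its own prediction strategy (degrees of freedom and the
    extra rescaling term of the conditional multivariate t).
    """

    def __init__(self, train_inputs, train_targets, likelihood):
        self.train_inputs = train_inputs
        self.train_targets = train_targets
        self.likelihood = likelihood
```

The point of the separate base class is exactly the balance described above: shared linear algebra underneath, but a parallel model hierarchy so the GP side is untouched.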

There are also some interesting questions, like what sparse / variational / deep versions of the model look like. I've done exactly 0 reading on the topic, so some of those questions may already have answers, but I certainly haven't (personally) seen them widely explored.

wjmaddox commented 2 years ago

At the prediction strategy level, this may not be that difficult to implement, as the conditional multivariate t distribution ends up having all of the same components as the multivariate Gaussian. The difference is a couple of rescaling terms that would need to be separately computed:

[screenshot of the conditional multivariate t equations]

(from https://arxiv.org/pdf/1402.4306.pdf)

The extra quantities are just $n_1$ (which is just an attribute) and $\beta_1$. I'd assume training would be straightforward via the multivariate t log likelihood, with the only difference being that the likelihood term for the Gaussian setting would need to be worked in beforehand.
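For readers viewing this without the image: as best I recall from that paper (zero-mean case; any transcription slips are mine), the conditional is

$$
y_2 \mid y_1 \;\sim\; \mathrm{MVT}\!\left(\nu + n_1,\; \tilde{\mu}_2,\; \frac{\nu + \beta_1 - 2}{\nu + n_1 - 2}\,\tilde{K}_{22}\right),
$$

with $\tilde{\mu}_2 = K_{21} K_{11}^{-1} y_1$, $\beta_1 = y_1^\top K_{11}^{-1} y_1$, and $\tilde{K}_{22} = K_{22} - K_{21} K_{11}^{-1} K_{12}$, so relative to the Gaussian conditional the only new scalar quantities are $n_1$ and $\beta_1$.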

Happy to discuss in more detail / put up a quick implementation if that would be helpful.
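As a quick sketch of those conditional formulas in PyTorch (a standalone helper assuming a zero prior mean; `conditional_mvt` is a hypothetical name, not part of gpytorch):

```python
import torch

def conditional_mvt(nu, y1, K11, K12, K22):
    """Parameters of p(y2 | y1) when (y1, y2) ~ MVT_nu(0, K).

    Hypothetical standalone helper (not gpytorch API), following the
    conditional from Shah et al. (2014), zero-mean case.
    Returns (df, mean, scale) of the conditional multivariate t.
    """
    n1 = y1.shape[-1]
    L = torch.linalg.cholesky(K11)                       # K11 = L L^T
    alpha = torch.cholesky_solve(y1.unsqueeze(-1), L)    # K11^{-1} y1
    beta1 = (y1.unsqueeze(-1) * alpha).sum()             # y1^T K11^{-1} y1
    mean = (K12.transpose(-1, -2) @ alpha).squeeze(-1)   # K21 K11^{-1} y1
    # Schur complement, same as the Gaussian predictive covariance
    K22_tilde = K22 - K12.transpose(-1, -2) @ torch.cholesky_solve(K12, L)
    scale = (nu + beta1 - 2.0) / (nu + n1 - 2.0) * K22_tilde
    return nu + n1, mean, scale
```

The mean and the Schur complement are exactly the Gaussian predictive quantities; the only extra work is the scalar $\beta_1$ and the degrees-of-freedom bump, which is the "couple of rescaling terms" point above.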

johnryan465 commented 2 years ago

I have started working on a pull request here. It is still a work in progress: so far I have only added a Multivariate Student T distribution, but I will continue developing the idea there.