cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch
MIT License

How to perform hessian vector product in a GPyTorch kernel? #831

Open Akella17 opened 5 years ago

Akella17 commented 5 years ago

I need to design a (Fisher) kernel of the form k(x1, x2) = ∇θx1ᵀ H ∇θx2, where H = ∇θ[∇θ(KL)] is the Hessian of the KL divergence.

Since it is computationally intractable to compute H beforehand, I cannot simply provide a closed-form kernel definition. Instead, I can define a function that computes the covariance matrix-vector product on the fly at prediction time, i.e.

H·v = ∇θ(∇θ(KL)·v)

K·v = ∇θ(X1·∇θ(∇θ(KL)·∇θ(X2·v)))

Here, KL is a scalar, and v, X1, and X2 are t-dimensional vectors, where t is the number of training examples.

Is there any way I can achieve this kernel with the existing GPyTorch library?

jacobrgardner commented 5 years ago

@Akella17 I think this is going to end up being more a PyTorch thing than a GPyTorch thing in general.

PyTorch supports Hessian-vector products via two sequential torch.autograd.grad calls, and all relevant custom GPyTorch functions are twice differentiable. If you search around, there are a few posts on the PyTorch forums about how to do this.
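A minimal sketch of the two-call pattern (the quadratic stand-in for the KL objective and all variable names here are illustrative, not the actual model):

```python
import torch

# Toy twice-differentiable scalar standing in for the KL divergence:
# kl(theta) = 0.5 * theta^T A theta, whose Hessian is exactly A.
theta = torch.randn(5, requires_grad=True)
A = torch.randn(5, 5)
A = A @ A.t()  # symmetrize so the Hessian is symmetric PSD
kl = 0.5 * theta @ A @ theta

v = torch.randn(5)

# First grad call: g = d(kl)/d(theta); create_graph=True keeps the
# graph so g can itself be differentiated.
g, = torch.autograd.grad(kl, theta, create_graph=True)

# Second grad call differentiates the scalar g.v, which yields H @ v
# without ever materializing the Hessian H.
hv, = torch.autograd.grad(g @ v, theta)

# Sanity check against the known Hessian of the toy objective.
assert torch.allclose(hv, A @ v, atol=1e-5)
```

This is exactly the H·v = ∇θ(∇θ(KL)·v) identity from the question: the inner grad produces ∇θ(KL), the dot product with v gives a scalar, and the outer grad of that scalar is the Hessian-vector product.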

Akella17 commented 5 years ago

@jacobrgardner I just need a little help locating where exactly the conjugate gradient step over the lazy covariance matrix is happening so that I can create a special case for my custom kernel there.
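For context on why a matvec-only kernel can work at all: conjugate gradients never needs the matrix K explicitly, only products K·v. The sketch below is a generic CG solver written against a matvec closure (it is not GPyTorch's internal routine, and the dense test matrix is just a stand-in for the implicit covariance); in principle the closure could be replaced by a Hessian-vector-product routine like the one above.

```python
import torch

def cg(matvec, b, max_iter=50, tol=1e-10):
    """Plain conjugate gradients for K x = b, where K is accessed
    only through the closure matvec(v) -> K @ v."""
    x = torch.zeros_like(b)
    r = b - matvec(x)   # residual
    p = r.clone()       # search direction
    rs = r @ r
    for _ in range(max_iter):
        Kp = matvec(p)
        alpha = rs / (p @ Kp)
        x = x + alpha * p
        r = r - alpha * Kp
        rs_new = r @ r
        if rs_new.sqrt() < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# A well-conditioned SPD matrix standing in for the kernel matrix.
K = torch.randn(6, 6)
K = K @ K.t() + 6 * torch.eye(6)
b = torch.randn(6)

x = cg(lambda v: K @ v, b)
assert torch.allclose(K @ x, b, atol=1e-4)
```

The same structure is why GPyTorch's lazy covariance machinery only requires a matrix-multiply routine from a custom operator: every solve reduces to repeated matvecs.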