Akella17 opened this issue 5 years ago
@Akella17 I think this is going to end up being more a PyTorch thing than a GPyTorch thing in general.
PyTorch supports Hessian-vector products via two sequential torch.autograd.grad calls, and all of the relevant custom GPyTorch functions are twice differentiable. If you google around, there are a few discussion posts on the PyTorch forums about how to do this.
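For reference, a minimal sketch of such a Hessian-vector product with two sequential torch.autograd.grad calls might look like the following (`theta`, `loss`, and `v` are placeholder names for this illustration, not GPyTorch API):

```python
import torch

# Placeholder parameter vector and the vector to multiply the Hessian by.
theta = torch.randn(5, requires_grad=True)
v = torch.randn(5)

# Any twice-differentiable scalar stands in for the real objective here.
loss = (theta ** 3).sum()

# First grad call: build the gradient with create_graph=True so it can be
# differentiated again.
(grad,) = torch.autograd.grad(loss, theta, create_graph=True)

# Second grad call: differentiating grad . v w.r.t. theta yields H @ v.
(hvp,) = torch.autograd.grad(grad, theta, grad_outputs=v)
```

The second call never forms the Hessian explicitly; it only produces the product H @ v.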
@jacobrgardner I just need a little help locating where exactly the conjugate gradient step over the lazy covariance matrix happens, so that I can create a special case for my custom kernel there.
I need to design a (Fisher) kernel of the form k(x1, x2) = ∇θ(x1)ᵀ **H** ∇θ(x2), where **H** = ∇θ[∇θ(KL)] is the Hessian of the KL divergence.
Since it is computationally intractable to compute **H** ahead of time, I cannot provide an explicit kernel definition up front. Instead, I can define a function that produces the covariance matrix-vector product on the fly during the prediction stage, i.e.,

**H**·v = ∇θ(∇θ(KL)·v)
**K**·v = ∇θ(X1·∇θ(∇θ(KL)·∇θ(X2·v)))
Here, KL is a scalar, and v, X1, and X2 are t-dimensional vectors, where t is the number of training examples. The matrices are bolded for easy distinction.
Is there any way I can implement this kernel with the existing GPyTorch library?
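For what it's worth, since the CG solves only ever touch the covariance through matrix-vector products, one rough way to express this might be a custom lazy tensor whose `_matmul` builds **K**·v on the fly with nested autograd calls. The sketch below is only illustrative, not a confirmed GPyTorch recipe: `FisherKernelLazyTensor`, `jac`, `kl_loss`, and `theta` are made-up names; it assumes `theta` is a single flat parameter tensor of length p and that the t × p matrix of per-example gradients fits in memory (a fully matrix-free variant would replace the `jac` products with additional autograd calls); and the base class lives at `gpytorch.lazy.LazyTensor` in older releases but has since moved to the `linear_operator` package.

```python
import torch
from gpytorch.lazy import LazyTensor  # assumption: older GPyTorch; newer versions use linear_operator


class FisherKernelLazyTensor(LazyTensor):
    # Hypothetical lazy representation of K = J H J^T that never materializes
    # K or H. `jac` is the t x p matrix whose rows are grad_theta of each
    # training example, `kl_loss` is the scalar KL built from `theta` with
    # differentiable ops, and `theta` is a single flat parameter tensor.
    def __init__(self, jac, kl_loss, theta):
        super().__init__(jac, kl_loss, theta)
        self.jac = jac
        self.kl_loss = kl_loss
        self.theta = theta

    def _size(self):
        t = self.jac.size(0)
        return torch.Size([t, t])

    def _transpose_nonbatch(self):
        return self  # the kernel matrix is symmetric

    def _matmul(self, rhs):
        # rhs has shape (t, k). Compute K @ rhs column by column as
        # J @ (H @ (J^T @ v)), where H @ u uses two sequential autograd.grad calls.
        (grad_kl,) = torch.autograd.grad(self.kl_loss, self.theta, create_graph=True)
        out_cols = []
        for v in rhs.unbind(-1):
            u = self.jac.t() @ v                                  # J^T v      -> (p,)
            (hvp,) = torch.autograd.grad(
                grad_kl, self.theta, grad_outputs=u, retain_graph=True
            )                                                     # H (J^T v)  -> (p,)
            out_cols.append(self.jac @ hvp)                       # J H J^T v  -> (t,)
        return torch.stack(out_cols, dim=-1)
```

If something along these lines is viable, the CG machinery would only ever call `_matmul`, so **K** itself is never formed explicitly.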