Akella17 opened this issue 5 years ago
@Akella17 I think this is going to end up being more a PyTorch thing than a GPyTorch thing in general.
PyTorch supports Hessian-vector products via two sequential torch.autograd.grad calls, and all of the relevant custom GPyTorch functions are twice differentiable. If you google around, there are a few discussion posts on the PyTorch forums about how to do this.
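For reference, a minimal sketch of such a Hessian-vector product with two sequential torch.autograd.grad calls might look like the following (`theta`, `loss`, and `v` are placeholder names for this illustration, not GPyTorch API):

```python
import torch

# Placeholder parameter vector and the vector to multiply the Hessian by.
theta = torch.randn(5, requires_grad=True)
v = torch.randn(5)

# Any twice-differentiable scalar stands in for the real objective here.
loss = (theta ** 3).sum()

# First grad call: build the gradient with create_graph=True so it can be
# differentiated again.
(grad,) = torch.autograd.grad(loss, theta, create_graph=True)

# Second grad call: differentiating grad . v w.r.t. theta yields H @ v.
(hvp,) = torch.autograd.grad(grad, theta, grad_outputs=v)
```

The second call never forms the Hessian explicitly; it only produces the product H @ v.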
@jacobrgardner I just need a little help locating where exactly the conjugate gradient step over the lazy covariance matrix happens, so that I can create a special case for my custom kernel there.
I need to design a (Fisher) kernel of the form k(x1, x2) = ∇θ(x1)ᵀ **H** ∇θ(x2), where **H** = ∇θ[∇θ(KL)] is the Hessian of the KL divergence.
Since it is computationally intractable to compute **H** ahead of time, I cannot provide an explicit kernel definition up front. Instead, I can define a function that produces the covariance matrix-vector product on the fly during the prediction stage, i.e.,

**H**·v = ∇θ(∇θ(KL)·v)
**K**·v = ∇θ(X1·∇θ(∇θ(KL)·∇θ(X2·v)))
Here, KL is a scalar, and v, X1, and X2 are t-dimensional vectors, where t is the number of training examples. The matrices are bolded for easy distinction.
Is there any way I can implement this kernel with the existing GPyTorch library?
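For what it's worth, since the CG solves only ever touch the covariance through matrix-vector products, one rough way to express this might be a custom lazy tensor whose `_matmul` builds **K**·v on the fly with nested autograd calls. The sketch below is only illustrative, not a confirmed GPyTorch recipe: `FisherKernelLazyTensor`, `jac`, `kl_loss`, and `theta` are made-up names; it assumes `theta` is a single flat parameter tensor of length p and that the t × p matrix of per-example gradients fits in memory (a fully matrix-free variant would replace the `jac` products with additional autograd calls); and the base class lives at `gpytorch.lazy.LazyTensor` in older releases but has since moved to the `linear_operator` package.

```python
import torch
from gpytorch.lazy import LazyTensor  # assumption: older GPyTorch; newer versions use linear_operator


class FisherKernelLazyTensor(LazyTensor):
    # Hypothetical lazy representation of K = J H J^T that never materializes
    # K or H. `jac` is the t x p matrix whose rows are grad_theta of each
    # training example, `kl_loss` is the scalar KL built from `theta` with
    # differentiable ops, and `theta` is a single flat parameter tensor.
    def __init__(self, jac, kl_loss, theta):
        super().__init__(jac, kl_loss, theta)
        self.jac = jac
        self.kl_loss = kl_loss
        self.theta = theta

    def _size(self):
        t = self.jac.size(0)
        return torch.Size([t, t])

    def _transpose_nonbatch(self):
        return self  # the kernel matrix is symmetric

    def _matmul(self, rhs):
        # rhs has shape (t, k). Compute K @ rhs column by column as
        # J @ (H @ (J^T @ v)), where H @ u uses two sequential autograd.grad calls.
        (grad_kl,) = torch.autograd.grad(self.kl_loss, self.theta, create_graph=True)
        out_cols = []
        for v in rhs.unbind(-1):
            u = self.jac.t() @ v                                  # J^T v      -> (p,)
            (hvp,) = torch.autograd.grad(
                grad_kl, self.theta, grad_outputs=u, retain_graph=True
            )                                                     # H (J^T v)  -> (p,)
            out_cols.append(self.jac @ hvp)                       # J H J^T v  -> (t,)
        return torch.stack(out_cols, dim=-1)
```

If something along these lines is viable, the CG machinery would only ever call `_matmul`, so **K** itself is never formed explicitly.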