vr308 opened this issue 4 years ago (status: Open)
Hmm, this is actually surprisingly challenging to do right now, because every time you call one of the getters (e.g., `kernel.outputscale`), it returns a newly transformed version of the raw outputscale.
Maybe one thing we could do is have these getters cache the transformed version, and clear the cache whenever `backward` is called? This would let you do something like `torch.autograd.grad(mll, model.covar_module.base_kernel.lengthscale)` and have it actually work.
Let me prototype that for lengthscales and see if it causes any problems.
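A minimal PyTorch sketch of why this matters, assuming softplus as the positivity constraint (GPyTorch's default) and using toy names: `torch.autograd.grad` on the actual hyperparameter only works if you hold onto the exact transformed tensor that entered the loss, which is what the proposed caching would provide. The same gradient can also be recovered from the raw gradient by the chain rule:

```python
import torch

# A raw (unconstrained) hyperparameter, as stored internally.
raw = torch.tensor(0.5, requires_grad=True)

# The actual hyperparameter is a transform of the raw one
# (softplus assumed here as the positivity constraint).
theta = torch.nn.functional.softplus(raw)

# Toy stand-in for the negative log marginal likelihood.
loss = (theta - 2.0) ** 2

# Gradient w.r.t. the *actual* hyperparameter. This works only because we
# kept a handle on the exact tensor `theta` used in the loss; calling a
# getter again would return a fresh tensor outside this graph.
(g_theta,) = torch.autograd.grad(loss, theta, retain_graph=True)

# Equivalent chain-rule route from the raw gradient:
# d(softplus(raw))/d(raw) = sigmoid(raw), so g_theta = g_raw / sigmoid(raw).
(g_raw,) = torch.autograd.grad(loss, raw)
g_theta_chain = g_raw / torch.sigmoid(raw)
```

The two routes agree, so even without caching one can convert raw gradients into gradients w.r.t. the actual hypers by dividing out the transform's Jacobian.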
There is a way to extract the gradient vector w.r.t. the "raw" (unconstrained) hypers, but I was wondering if I could obtain the gradients w.r.t. the actual (constrained) hypers.
The use case is to visualise the gradient field superimposed on the negative log marginal likelihood surface, like below: