Hi, I have a question regarding the computation of the GGN approximation for the Hessian in the regression case:
def _get_full_ggn(self, Js, f, y):
    """Compute full GGN from Jacobians.

    Parameters
    ----------
    Js : torch.Tensor
        Jacobians `(batch, outputs, parameters)`
    f : torch.Tensor
        functions `(batch, outputs)`
    y : torch.Tensor
        labels compatible with loss

    Returns
    -------
    loss : torch.Tensor
    H_ggn : torch.Tensor
        full GGN approximation `(parameters, parameters)`
    """
    loss = self.factor * self.lossfunc(f, y)
    if self.likelihood == 'regression':
        # output Hessian is the identity here, so the GGN is just sum_n J_n^T J_n
        H_ggn = torch.einsum('mkp,mkq->pq', Js, Js)
When I compare that with equation 9 in your paper "Bayesian Deep Learning via Subnetwork Inference", I do not see the Hessian of the negative log-likelihood w.r.t. the model outputs. Why does that term vanish in the regression case, leaving just the product of the Jacobians?
The Hessian of the negative log-likelihood w.r.t. the model outputs is implicitly there, since it is simply the identity matrix: for a Gaussian likelihood the NLL is, up to additive constants, 0.5 * ||f - y||^2, whose Hessian w.r.t. f is the identity, so the GGN sum_n J_n^T H_n J_n reduces to sum_n J_n^T J_n, which is exactly what the einsum computes. See appendix A.2 of this paper.
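If you want to convince yourself numerically, here is a minimal sketch (my own illustration, not library code; the shapes are arbitrary) that checks both claims: the Hessian of the MSE-style NLL w.r.t. the outputs is the identity, and the einsum from the snippet above equals the explicit per-sample sum J_n^T H_n J_n with H_n = I:

import torch

batch, outputs, params = 4, 3, 5
Js = torch.randn(batch, outputs, params)  # Jacobians `(batch, outputs, parameters)`
f = torch.randn(batch, outputs)           # model outputs
y = torch.randn(batch, outputs)           # regression targets

# NLL of a unit-variance Gaussian, up to additive constants.
nll = lambda f_: 0.5 * ((f_ - y) ** 2).sum()

# Its Hessian w.r.t. the (flattened) outputs is exactly the identity matrix.
H = torch.autograd.functional.hessian(nll, f)
H = H.reshape(batch * outputs, batch * outputs)
assert torch.allclose(H, torch.eye(batch * outputs))

# So the GGN sum_n J_n^T H_n J_n collapses to the einsum used in the code.
H_ggn = torch.einsum('mkp,mkq->pq', Js, Js)
H_explicit = sum(Js[n].T @ torch.eye(outputs) @ Js[n] for n in range(batch))
assert torch.allclose(H_ggn, H_explicit)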