noahgolmant / pytorch-hessian-eigenthings

Efficient PyTorch Hessian eigendecomposition tools!
MIT License

Hessian for multilayer network #19

Closed gudovskiy closed 5 years ago

gudovskiy commented 5 years ago
  1. It seems that you concatenate the parameters from all layers and compute one large Hessian, rather than a separate Hessian for each layer. Is that right?
  2. Have you tried estimating the Hessian of each layer separately?
noahgolmant commented 5 years ago

The Hessian-vector product is computed with PyTorch's automatic differentiation. The torch.autograd.grad function returns one tensor per parameter tensor in the module, and I then concatenate these into a single vector.
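A minimal sketch of the computation described above, not the repository's actual code: the function name `hessian_vector_product`, the toy model, and the random data are all assumptions made for illustration. It takes the gradient with `torch.autograd.grad`, flattens the per-parameter results into one vector, and differentiates the dot product with `vec` a second time.

```python
import torch

def hessian_vector_product(loss, params, vec):
    """Return H @ vec as one flat vector, where H is the Hessian of `loss`
    with respect to all tensors in `params` taken together."""
    # First pass: one gradient tensor per parameter tensor.
    # create_graph=True keeps the graph so we can differentiate again.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat_grad = torch.cat([g.reshape(-1) for g in grads])
    # Second pass: d/dparams of (grad . vec) equals H @ vec.
    hvp = torch.autograd.grad(flat_grad @ vec, params)
    # Concatenate the per-parameter pieces into a single vector.
    return torch.cat([h.reshape(-1) for h in hvp])

# Toy two-layer network and random data (illustrative only).
torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(4, 3), torch.nn.Tanh(), torch.nn.Linear(3, 1)
)
x, y = torch.randn(8, 4), torch.randn(8, 1)
params = list(model.parameters())
num_params = sum(p.numel() for p in params)

loss = torch.nn.functional.mse_loss(model(x), y)
vec = torch.randn(num_params)
hv = hessian_vector_product(loss, params, vec)
print(hv.shape)
```

Because `params` holds every layer's tensors, the result is a product with the full Hessian, including the cross-layer blocks. The loss is recomputed before each call in this sketch, since the second `torch.autograd.grad` call frees the graph.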

I am not sure why you would need to compute only one layer's Hessian at a time. To reach a given layer you already have to backpropagate from the later layers down to it, so handling each layer separately would just make the total work quadratic in the number of layers.
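The quadratic growth can be seen with a rough count. This is a simplified cost model assumed for illustration (one unit of work per layer that backpropagation passes through), and `backprop_layer_steps` is a hypothetical helper, not part of the library.

```python
def backprop_layer_steps(num_layers):
    """Count layer-steps of backpropagation under a simplified cost model.

    Layers are numbered 1..num_layers from input to output. Reaching layer i
    from the loss is assumed to cost (num_layers - i + 1) steps, because the
    backward pass must go through layers num_layers, ..., i.
    """
    # One joint pass over all parameters sweeps every layer once.
    joint = num_layers
    # A separate pass per layer repeats the sweep down to that layer each time.
    per_layer = sum(num_layers - i + 1 for i in range(1, num_layers + 1))
    return joint, per_layer

for n in (2, 8, 32):
    joint, per_layer = backprop_layer_steps(n)
    print(f"layers={n:2d}  joint={joint:3d}  per-layer total={per_layer:4d}")
```

Under this model the per-layer total is n(n+1)/2 for n layers, against n for the single joint pass.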