amirgholami / PyHessian

PyHessian is a PyTorch library for second-order analysis and training of neural networks
MIT License

Potential bugs #19

Open yxiao54 opened 1 year ago

yxiao54 commented 1 year ago

Line 71 of utils.py, grads.append(0 if param.grad is None else param.grad + 0.), should be rewritten as: grads.append(param-param if param.grad is None else param.grad + 0.)

The current implementation can fail when the model contains unused layers. Specifically, when a layer's parameters have requires_grad set to True but the layer does not participate in the forward or backward pass, param.grad is None, so its entry in grads is set to the scalar 0 instead of a tensor with the parameter's shape. This triggers an error when torch.autograd checks the shapes of grads. Details can be seen in this discussion: https://github.com/amirgholami/PyHessian/issues/8
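
To illustrate the problem and one possible fix, here is a minimal sketch. The model TinyNet and the helper collect_params_and_grads are hypothetical names made up for this example; they are not PyHessian's actual code. The sketch uses torch.zeros_like(param) in place of the param-param expression proposed above. Both produce a zero tensor with the parameter's shape, but zeros_like does not attach the result to the autograd graph, which may or may not matter for a given use.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical model with one layer that is never used in forward()."""

    def __init__(self):
        super().__init__()
        self.used = nn.Linear(4, 2)
        self.unused = nn.Linear(4, 2)  # requires_grad=True, but never called

    def forward(self, x):
        return self.used(x)


def collect_params_and_grads(model):
    """Hypothetical helper: gather trainable params and same-shaped grads.

    Parameters whose .grad is None (e.g. unused layers) get a zero tensor
    of matching shape instead of the Python scalar 0.
    """
    params, grads = [], []
    for param in model.parameters():
        if not param.requires_grad:
            continue
        params.append(param)
        # Assumption: zeros_like is used here instead of `param - param`.
        grads.append(torch.zeros_like(param) if param.grad is None else param.grad + 0.)
    return params, grads


torch.manual_seed(0)
model = TinyNet()
loss = model(torch.randn(3, 4)).sum()
loss.backward()  # model.unused never receives a gradient, so its .grad stays None

params, grads = collect_params_and_grads(model)

# Every grad entry is now a tensor whose shape matches its parameter.
shapes_match = all(g.shape == p.shape for p, g in zip(params, grads))

# Entries for the unused layer are all-zero tensors.
unused_grad_is_zero = all(
    torch.count_nonzero(g).item() == 0
    for p, g in zip(params, grads)
    if p.grad is None
)
```

With the original scalar 0, the entries for the unused layer would have no shape at all, which is what leads to the shape mismatch described above.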