Open · Kingsleyandher opened this issue 1 year ago
```python
import torch

# HessianEstimator is the base class defined elsewhere in this repo.
class HutchinsonEstimator(HessianEstimator):
    def estimate(self, p, grad):
        # Random probe vector with the same shape as the gradient.
        u = torch.randn_like(grad)
        grad_dot_u = torch.sum(grad * u)
        print(f"grad_dot_u requires grad: {grad_dot_u.requires_grad}")  # -> False
        # ↓ RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn.
        hessian_vector_product = torch.autograd.grad(
            grad_dot_u, p, retain_graph=True)[0]
        return u * hessian_vector_product
```
This problem is the same as #7.
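For context: `torch.autograd.grad(grad_dot_u, p)` can only succeed if `grad` is itself still attached to the autograd graph, i.e. the gradient was produced with `create_graph=True`. A gradient taken from `p.grad` after a plain `loss.backward()` has no `grad_fn`, which is exactly the error above. A minimal working sketch of a Hutchinson Hessian-vector product on a toy loss (illustrative only, not this repo's API):

```python
import torch

# Toy problem: loss = sum(p^2), so the Hessian is 2*I.
p = torch.randn(4, requires_grad=True)
loss = (p ** 2).sum()

# create_graph=True keeps the gradient attached to the graph,
# so it can be differentiated a second time.
(grad,) = torch.autograd.grad(loss, p, create_graph=True)
print(grad.requires_grad)  # -> True

u = torch.randn_like(grad)        # random probe vector
grad_dot_u = torch.sum(grad * u)  # scalar g·u, still differentiable
(hvp,) = torch.autograd.grad(grad_dot_u, p, retain_graph=True)
diag_estimate = u * hvp           # Hutchinson estimate of diag(H); here exactly 2*u**2
```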
Hello @Kingsleyandher, I'm hitting the same issue. Have you solved it?
Hello, I hit an error when using the Sophia optimizer to train GPT-3 with Megatron. The problem is that `grad` reaches the optimizer without `requires_grad=True`, so the optimizer state cannot compute the second derivative. Do you know how to solve this?

```
File "/root/miniconda3/envs/torch18/lib/python3.7/site-packages/torch/autograd/__init__.py", line 277, in grad
    allow_unused, accumulate_grad=False)  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn.
```
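In case it helps: in Megatron the gradients handed to the optimizer come from a plain backward pass (and mixed-precision training typically copies them into detached fp32 buffers), so they carry no `grad_fn`. One hedged workaround, assuming you can afford an extra differentiable backward pass on a sampled batch, is to build the Hessian estimate from a freshly computed gradient rather than from `p.grad`. `hutchinson_diag` below is an illustrative helper, not part of Sophia or Megatron:

```python
import torch

def hutchinson_diag(loss, params, n_samples=1):
    """Estimate diag(Hessian) of a scalar `loss` w.r.t. `params`.

    `loss` must still be attached to the autograd graph, i.e. call this
    before (or instead of) the detaching backward pass.
    """
    # Differentiable gradients: create_graph=True is the key step.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    estimates = [torch.zeros_like(p) for p in params]
    for _ in range(n_samples):
        us = [torch.randn_like(g) for g in grads]
        # Scalar g·u across all parameters; still differentiable.
        grad_dot_u = sum((g * u).sum() for g, u in zip(grads, us))
        hvps = torch.autograd.grad(grad_dot_u, params, retain_graph=True)
        for est, u, hvp in zip(estimates, us, hvps):
            est.add_(u * hvp / n_samples)  # average u ⊙ Hu over samples
    return estimates
```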