Closed: JBaum2000 closed this issue 1 year ago
Maybe the problem is the `AdamW` optimizer from transformers; try using `torch.optim.AdamW` instead.
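For illustration, a minimal sketch of that swap inside a LightningModule's `configure_optimizers` hook (a fragment, not the reporter's actual code; the learning rate is a placeholder):

```python
import torch

def configure_optimizers(self):
    # was: from transformers import AdamW; return AdamW(self.parameters(), lr=1e-3)
    # use the PyTorch implementation instead of the deprecated transformers one
    return torch.optim.AdamW(self.parameters(), lr=1e-3)
```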
From your code it looks like you were using the `AdamW` from transformers, as @stephen-nju suspected (thanks for helping out ❤️). This was fixed in #18268, so I'm closing the issue. But let me know if this doesn't solve it for you.
As a workaround, if you can't wait for the fix to be released, you can manually call `torch.set_grad_enabled(True)` at the beginning of your training step.
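A hypothetical sketch of that workaround inside a LightningModule's `training_step`; the forward pass and loss below are placeholders, not taken from the original report:

```python
import torch

def training_step(self, batch, batch_idx):
    # Re-enable gradient tracking in case it was switched off for this step.
    torch.set_grad_enabled(True)
    x, y = batch
    loss = torch.nn.functional.cross_entropy(self(x), y)  # placeholder forward/loss
    return loss
```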
Bug description

With `precision=32` in the Trainer, a `RuntimeError` occurs after the last `training_step`, before optimization. The error appears to be associated with `torch.is_grad_enabled()` being set to `False` in the last iteration of the `training_step`. With `precision=16` the error does not occur and `torch.is_grad_enabled()` prints `True` on each iteration. The error seems similar to #17949.
What version are you seeing the problem on?
v2.0
How to reproduce the bug
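The original reproduction script was not included in this section. Below is a minimal sketch of the setup described above, assuming a toy dataset, a toy module, and the transformers `AdamW` flagged in the comments; all names are placeholders:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import lightning.pytorch as pl
from transformers import AdamW  # the optimizer flagged in the comments above


class ToyModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        # Per the report, this prints False on the last iteration with
        # precision=32, but True on every iteration with precision=16.
        print(torch.is_grad_enabled())
        return torch.nn.functional.cross_entropy(self.layer(x), y)

    def configure_optimizers(self):
        return AdamW(self.parameters(), lr=1e-3)


data = DataLoader(
    TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,))),
    batch_size=8,
)
trainer = pl.Trainer(max_epochs=1, precision=32)  # precision=16 reportedly avoids the error
trainer.fit(ToyModule(), data)
```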
Error messages and logs
Environment
cc @borda