Closed: San-ctuary closed this issue 2 years ago
Hi, may I ask how you ended up solving this problem? I'm hitting the same error here.
The author has already answered this in another issue; it appears to be a PyTorch version problem. If you don't want to downgrade, just comment out the first .step().
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [512, 25]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
Solved it with the solution above, thanks!
https://github.com/thuml/Anomaly-Transformer/blob/bfe075e4f3a0be789f168b2aeee7a4ce30482ce5/solver.py#L189-L192 When the first optimizer.step() executes, all the parameters involved in loss1 are updated in place, but some of those parameters are shared with loss2's graph, so the second backward pass sees modified tensors and this may cause the problem.
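To illustrate the mechanism, here is a minimal, self-contained sketch (a toy two-layer model, not the repo's actual Anomaly-Transformer code) of the failing pattern and the fix described above. Calling optimizer.step() between the two backward passes modifies the shared parameters in place, so the retained graph for loss2 holds stale saved tensors and recent PyTorch raises the RuntimeError from this thread; removing the first .step() and letting the gradients accumulate before a single step avoids it.

```python
import torch

torch.manual_seed(0)

def make_model():
    # Two stacked Linear layers: the second layer's weight is saved in the
    # graph (needed for grad w.r.t. the hidden activation), so an in-place
    # update to it invalidates a retained graph.
    return torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.Linear(8, 4))

x = torch.randn(16, 4)

# --- Failing pattern: .step() between the two backward passes ---
model = make_model()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
out = model(x)
loss1, loss2 = out.mean(), (out ** 2).mean()   # two losses sharing one graph

opt.zero_grad()
loss1.backward(retain_graph=True)
opt.step()                     # in-place update bumps parameter versions
try:
    loss2.backward()           # retained graph now references stale tensors
    step_between_failed = False
except RuntimeError:
    step_between_failed = True  # "modified by an inplace operation" error

# --- Fixed pattern: accumulate both gradients, then a single step ---
model = make_model()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
out = model(x)
loss1, loss2 = out.mean(), (out ** 2).mean()

opt.zero_grad()
loss1.backward(retain_graph=True)
loss2.backward()               # gradients accumulate in .grad
opt.step()                     # one update using the combined gradients
fixed_ok = True
```

Note that the fix changes the optimization semantics slightly: instead of two sequential updates, the parameters are updated once with the sum of both gradients, which is why the advice above is to comment out only the first .step().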