Problem encountered during training

zh9369 commented 4 years ago

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [4, 3, 19, 19, 85]], which is output 0 of AsStridedBackward, is at version 6; expected version 3 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

I hit this error while training on my own dataset. I have never run into it before. Is there a way to fix it?

karmueo commented 4 years ago

I ran into the same problem. Is there a solution?

Tianxiaomo commented 4 years ago

Could you post the complete error message? @zh9369 @karmueo

zh9369 commented 4 years ago

Traceback (most recent call last):
  File "C:/Users/the_moon/Desktop/python/YOLOV/yolov4/pytorch-YOLOv4-master/train.py", line 428, in <module>
    device=device, )
  File "C:/Users/the_moon/Desktop/python/YOLOV/yolov4/pytorch-YOLOv4-master/train.py", line 308, in train
    loss.backward()
  File "C:\Users\the_moon\AppData\Local\Continuum\anaconda3\lib\site-packages\torch\tensor.py", line 118, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "C:\Users\the_moon\AppData\Local\Continuum\anaconda3\lib\site-packages\torch\autograd\__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [4, 3, 19, 19, 85]], which is output 0 of AsStridedBackward, is at version 6; expected version 3 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Process finished with exit code 1

This is the complete error output.
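The hint at the end of the traceback can be followed roughly as below. This is a minimal sketch, not the repository's training loop: `failing_step`, `run_with_anomaly_detection`, the tensor shape, and the use of sigmoid are hypothetical stand-ins chosen only to trigger the same class of error. It uses the context manager `torch.autograd.detect_anomaly()`; the `torch.autograd.set_detect_anomaly(True)` call named in the hint switches the same mode on globally.

```python
import torch


def failing_step():
    # Hypothetical stand-in for a training step that edits a tensor in place.
    raw = torch.randn(1, 3, 2, 2, 85, requires_grad=True)
    out = torch.sigmoid(raw)            # sigmoid keeps its result for backward
    out[..., :2] = out[..., :2] * 2.0   # in-place write into that saved result
    out.sum().backward()                # autograd notices the version mismatch


def run_with_anomaly_detection():
    # With anomaly detection on, PyTorch also emits a warning containing the
    # forward-pass traceback of the operation whose gradient could not be
    # computed, which points at the line to inspect.
    with torch.autograd.detect_anomaly():
        try:
            failing_step()
        except RuntimeError as err:
            return str(err)
    return ""


message = run_with_anomaly_detection()
print(message)
```

Anomaly detection slows training noticeably, so it is usually enabled only while hunting for the offending line.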

Tianxiaomo commented 4 years ago

@zh9369 Which version of PyTorch are you using?

zh9369 commented 4 years ago

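It seems to work now. Your question made me realize I was on version 1.2; after upgrading to 1.5, training runs. Looking forward to the results. Thanks for your help!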

karmueo commented 4 years ago

Same problem here, and upgrading to 1.5 also fixed it. Thanks to both of you for the help.
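A quick way to check which PyTorch version is installed before training is sketched below. The helper `major_minor` and the `(1, 5)` threshold are illustrative assumptions taken from the reports in this thread (1.2 failed, 1.5 worked), not an official requirement of the repository.

```python
import torch


def major_minor(version):
    # Drop a local build suffix such as "+cu101", then keep major and minor.
    core = version.split("+")[0]
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


installed = major_minor(torch.__version__)
if installed < (1, 5):
    print("PyTorch", torch.__version__, "is older than 1.5; consider upgrading.")
else:
    print("PyTorch", torch.__version__, "is 1.5 or newer.")
```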

jiaoxiaosong commented 3 years ago

Which part of the code actually causes this?

Sukeysun commented 3 years ago

"Which part of the code actually causes this?"

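It is caused by YOLO_Loss in train.py. The tensor output is modified in place inside YOLO_Loss, so the backward pass fails. You can create a new variable with the same shape as output and write the modified values into that variable instead; this resolves the problem.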
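The workaround described above might look something like the following. This is a minimal sketch under stated assumptions, not the repository's loss code: the function names, the small tensor shape, the sigmoid, and the factor 2.0 are all hypothetical and exist only to show the failing pattern next to the fix.

```python
import torch


def loss_inplace(raw):
    # Failing pattern: sigmoid keeps its result for the backward pass, and the
    # in-place write below changes that saved tensor (its version counter moves).
    output = torch.sigmoid(raw)
    output[..., :2] = output[..., :2] * 2.0
    return output.sum()


def loss_with_copy(raw):
    # Workaround: leave `output` untouched and write the modified values into
    # a fresh tensor of the same shape.
    output = torch.sigmoid(raw)
    decoded = torch.zeros_like(output)
    decoded[..., :2] = output[..., :2] * 2.0
    decoded[..., 2:] = output[..., 2:]
    return decoded.sum()


raw = torch.randn(1, 3, 2, 2, 85, requires_grad=True)

try:
    loss_inplace(raw).backward()
except RuntimeError as err:
    print("in-place version failed:", err)

raw.grad = None
loss_with_copy(raw).backward()
print("copy version produced a gradient of shape", tuple(raw.grad.shape))
```

Writing into `decoded` is still an indexed assignment, but `decoded` is a new tensor that no earlier operation saved for its backward pass, so autograd has nothing to invalidate.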

Tianxiaomo / pytorch-YOLOv4

Problem encountered during training #34