facebookresearch / 3detr

Code & Models for 3DETR - an End-to-end transformer model for 3D object detection
Apache License 2.0

RuntimeError: ReluBackward0, is at version 1; expected version 0 instead #55

Closed: yuanze1024 closed this 11 months ago

yuanze1024 commented 11 months ago

I tried a 90-epoch training run as a sanity check on a 3090 GPU with torch 1.13, and it failed with this exception:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256, 8, 256]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
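For readers who want to see this class of error in isolation, here is a minimal sketch that is not taken from the 3DETR code. It assumes the cause is an in-place Dropout applied directly to a ReLU output: ReLU keeps its output for the backward pass, so overwriting that tensor bumps its version counter and autograd refuses to use it. The function name `try_backward` and the tiny `nn.Sequential` block are made up for illustration. Passing `detect_anomaly=True` turns on the anomaly detection mentioned in the hint, which additionally reports the forward-pass traceback of the failing operation.

```python
from typing import Optional

import torch
import torch.nn as nn


def try_backward(inplace: bool, detect_anomaly: bool = False) -> Optional[str]:
    """Run one forward/backward pass; return None on success, else the error text.

    Hypothetical toy block (not the 3DETR model): Linear -> ReLU -> Dropout.
    """
    torch.manual_seed(0)
    block = nn.Sequential(
        nn.Linear(4, 4),
        nn.ReLU(),
        # With inplace=True, Dropout overwrites the ReLU output in training mode.
        nn.Dropout(p=0.5, inplace=inplace),
    )
    block.train()  # Dropout only modifies its input in training mode
    x = torch.randn(2, 4)
    try:
        # set_detect_anomaly also works as a context manager; the forward pass
        # must run inside it so the offending op's traceback can be recorded.
        with torch.autograd.set_detect_anomaly(detect_anomaly):
            block(x).sum().backward()
    except RuntimeError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print("inplace=False:", try_backward(inplace=False))
    print("inplace=True: ", try_backward(inplace=True))
```

If the second call reports a message mentioning ReluBackward0, the toy setup is hitting the same check as the traceback above, which would also explain why the error names ReLU even though the in-place flag sits on Dropout.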

So I removed every inplace=True in this repo (all of them are on Dropout, not on the ReLU mentioned in the traceback), and the problem was solved. Training may or may not be slightly slower without the in-place ops, but that is better than crashing.
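As an alternative to editing the source, the same effect can probably be had at runtime. The sketch below is an assumption-laden variant of the fix above, not what was actually done in this thread: it walks a model's submodules and switches off any `inplace` flag it finds. The helper name `disable_inplace` is hypothetical. It relies only on the `modules()` iterator and the `inplace` attribute that PyTorch modules such as `nn.Dropout` and `nn.ReLU` expose.

```python
def disable_inplace(model) -> int:
    """Set inplace=False on every submodule that has inplace=True.

    `model` is assumed to behave like a torch.nn.Module, i.e. it offers a
    modules() iterator over itself and all of its submodules.
    Returns the number of submodules that were changed.
    """
    changed = 0
    for module in model.modules():
        # Only touch modules that actually carry an enabled inplace flag.
        if getattr(module, "inplace", False) is True:
            module.inplace = False
            changed += 1
    return changed
```

It would be called once after building the model and before training, for example `disable_inplace(model)`. Note that this only reaches module attributes; functional calls written with `inplace=True` in a `forward` method would still need to be edited by hand.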

I'm leaving this here in case someone else runs into the same problem.

Serzhanov commented 4 months ago

Thanks, man. I encountered the same error, and this helped me. Now my Precision and Recall are 0 over all classes. Do you have any suggestions, please?