azhangmn opened this issue 2 years ago
Hi, @azhangmn. It seems that an in-place ReLU operation results in this error. Could you check where this operation is used? Note that the input of this ReLU layer has a shape of [2, 512, 27, 27].
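The failure mode described above can be reproduced in isolation. This is a hypothetical minimal sketch, not mmseg's actual code: ReLU saves its output tensor for the backward pass, so any later in-place mutation of that output (such as `+=`) bumps the tensor's version counter and invalidates the autograd graph.

```python
import torch
import torch.nn as nn

# Hypothetical minimal reproduction (not mmseg's actual code): ReLU saves
# its output for the backward pass, so mutating that output in place
# afterwards invalidates the autograd graph.
x = torch.randn(2, 512, 27, 27, requires_grad=True)
relu = nn.ReLU(inplace=True)

y = relu(x.clone())  # autograd saves y to compute ReLU's gradient
y += 1               # in-place add bumps y's version counter

err = None
try:
    y.sum().backward()
except RuntimeError as e:
    err = e
print(err)
```

Running this prints the same "modified by an inplace operation" RuntimeError as in the issue, with the ReLU output at version 1 where version 0 was expected.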
Hi @impiga, it seems that every part of the model except the Swin Transformer uses in-place ReLU. However, these parameters come from the default settings in mmcv, if I have interpreted the code correctly. Do you have any suggestions for fixing this problem? Thanks!
I found the solution, which was merged into the master branch of mmseg: https://github.com/open-mmlab/mmsegmentation/pull/1103. It turns out that uper_head.py in mmseg/models/decode_heads throws this error on line 104 due to the `+=` operation. Similarly, fpn.py under mmseg/models/necks throws it due to the `+=` operations on lines 178 and 181. Please update this code accordingly.
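The pattern of the fix in that PR can be sketched as follows (pattern only, not mmseg's exact code): replacing the in-place `+=` on a tensor that autograd has saved with an out-of-place `+`, which allocates a new tensor and leaves the saved one untouched.

```python
import torch
import torch.nn as nn

# Sketch of the fix pattern (not mmseg's exact code): use out-of-place
# addition instead of `+=` on a tensor saved for backward.
x = torch.randn(2, 512, 27, 27, requires_grad=True)
relu = nn.ReLU(inplace=True)

y = relu(x.clone())
# Old, broken pattern: y += other   (mutates the saved ReLU output)
y = y + 1            # fixed: out-of-place add leaves the saved output intact
y.sum().backward()   # backward now succeeds
print(x.grad.shape)
```

With the out-of-place add, backward completes and gradients flow to the input as expected.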
Hi there, I am trying to train swin_upernet on a custom dataset. My config file is as follows:
However, when I try to run training, I get the following error:

```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [2, 512, 27, 27]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
```
This happens while running with torch autograd anomaly detection enabled, to try to locate the source of the error.
Has anyone dealt with this before? How can I fix the issue?
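For anyone hitting the same error, anomaly mode (which produces the "Hint: the backtrace further above..." message in the error) can be enabled like this; it is a sketch of standard PyTorch usage, not mmseg-specific code:

```python
import torch

# With anomaly detection on, PyTorch records the forward-pass stack trace
# of each op, so a failing backward points at the offending forward op.
with torch.autograd.detect_anomaly():
    x = torch.randn(4, requires_grad=True)
    y = torch.relu(x) + 1   # out-of-place ops: backward succeeds here
    y.sum().backward()
print(x.grad is not None)
```

Note that anomaly mode slows training down considerably, so it should only be used while debugging.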