dlyldxwl / fssd.pytorch

PyTorch re-implementation of FSSD

loss.backward() #2

Open DW1HH opened 5 years ago

DW1HH commented 5 years ago

```
Traceback (most recent call last):
  File "train.py", line 268, in <module>
    train()
  File "train.py", line 233, in train
    loss.backward()
  File "/home/huhuai/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/huhuai/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
```

dlyldxwl commented 5 years ago

@DW1HH The error is caused by an in-place operation in the model file; you need to turn off the in-place switch in models/fssd_vgg.py.
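
A minimal illustration of that change (not the actual fssd_vgg.py code; the layers there may be arranged differently): an in-place ReLU overwrites a tensor that autograd may still need for the backward pass, so switching to the out-of-place form avoids the error.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustration only; the real layers live in models/fssd_vgg.py.
conv = nn.Conv2d(256, 256, kernel_size=3, padding=1)
x = torch.randn(1, 256, 9, 9, requires_grad=True)

# Before (can break backward() when the activation is reused elsewhere):
# y = F.relu(conv(x), inplace=True)

# After: out-of-place ReLU, safe for autograd.
y = F.relu(conv(x))
y.sum().backward()
```

The same applies to module-style activations: use `nn.ReLU(inplace=False)` instead of `nn.ReLU(inplace=True)`.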

rw1995 commented 5 years ago

> @DW1HH The error is caused by an in-place operation in the model file; you need to turn off the in-place switch in models/fssd_vgg.py.

Sorry, I also ran into this problem when training on VOC2007. Could you explain in detail how to fix it?

rw1995 commented 5 years ago

like this?

```
Loading base network...
Initializing weights...
train.py:98: UserWarning: nn.init.kaiming_normal is now deprecated in favor of nn.init.kaiming_normal_.
  init.kaiming_normal(m.state_dict()[key], mode='fan_out')
Loading Dataset...
Training FSSD_VGG on VOC0712
avg_loss_list: [0.0]
/home/rw/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/nn/functional.py:2539: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
/home/rw/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/nn/_reduction.py:46: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
  warnings.warn(warning.format(ret))
Traceback (most recent call last):
  File "train.py", line 262, in <module>
    train()
  File "train.py", line 227, in train
    loss.backward()
  File "/home/rw/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/tensor.py", line 107, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/rw/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [16, 256, 9, 9]], which is output 0 of ReluBackward1, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
```
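
The hint at the end of that trace can be followed directly: enabling anomaly detection makes autograd re-raise the error with a second traceback pointing at the forward-pass operation whose output was later modified in place. A minimal sketch with placeholder tensors (not the FSSD model), just to show where the switch goes:

```python
import torch

# Enable once, before the training loop.
torch.autograd.set_detect_anomaly(True)

# Tiny reproduction of the failure mode:
w = torch.randn(4, 4, requires_grad=True)
x = torch.randn(4, 4)

h = torch.relu(x @ w)   # ReLU's output is saved for the backward pass
h += 1.0                # ...but an in-place edit bumps its version counter,
h.sum().backward()      # so backward() fails, and anomaly mode names the ReLU call
```

With that enabled in train.py, the error should point at the exact line in models/fssd_vgg.py whose `inplace` flag needs to be turned off.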

rw1995 commented 5 years ago

In fssd_vgg.py I replaced the in-place ReLU in the extras loop:

```python
for k, v in enumerate(self.extras):
    # x = F.relu(v(x), inplace=True)
    x = F.relu(v(x))
```

That gets past the error, but the loss then shows L: nan C: nan S: nan, so I think what I did is wrong.

naviocean commented 4 years ago

I got the same issue. After changing to inplace=False, it returned L: nan C: nan S: nan while training.

zhaohao0404 commented 4 years ago

> In fssd_vgg.py I replaced the in-place ReLU in the extras loop, `x = F.relu(v(x), inplace=True)` → `x = F.relu(v(x))`. That gets past the error, but the loss then shows L: nan C: nan S: nan, so I think what I did is wrong.

Hi my friend, have you solved this problem? Those UserWarnings can be corrected easily; if you are still having trouble with them, you can contact me for the solutions.
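
For anyone who lands here later, the three UserWarnings in the log above each state their own fix; here is a hedged sketch of the replacements (the surrounding code in train.py and fssd_vgg.py will of course look different):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.init as init

# 1) Deprecated initializer name: add the trailing underscore (in-place variant).
w = torch.empty(64, 128, 3, 3)
init.kaiming_normal_(w, mode='fan_out')      # was: init.kaiming_normal(w, mode='fan_out')

# 2) Bilinear upsampling: state align_corners explicitly to keep the old behaviour
#    (the call is F.upsample on very old PyTorch versions, F.interpolate later).
x = torch.randn(1, 64, 19, 19)
x = F.interpolate(x, size=(38, 38), mode='bilinear', align_corners=True)

# 3) Losses: size_average / reduce are deprecated in favor of reduction=.
ce = nn.CrossEntropyLoss(reduction='sum')    # was: size_average=False
```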

Shawn0Hsu commented 3 years ago

> In fssd_vgg.py I replaced the in-place ReLU in the extras loop, `x = F.relu(v(x), inplace=True)` → `x = F.relu(v(x))`. That gets past the error, but the loss then shows L: nan C: nan S: nan, so I think what I did is wrong.

> Hi my friend, have you solved this problem? Those UserWarnings can be corrected easily; if you are still having trouble with them, you can contact me for the solutions.

Hello, I also encountered a similar problem. How did you solve it? Thank you @chaomartin @naviocean @rw1995

YXB-NKU commented 2 years ago

> I got the same issue. After changing to inplace=False, it returned L: nan C: nan S: nan while training.

I have got this problem too. Have you solved it yet?