uoip / SSD-variants

PyTorch implementation of several SSD based object detection algorithms.
MIT License

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation #3

Closed. seongkyun closed this issue 6 years ago

seongkyun commented 6 years ago

Hello. I've tried to run this code with

python train.py --cuda --voc_root ~/data/VOCdevkit --batch_size 16 --backbone ./vgg16_reducedfc.pth

but it fails with the error below:

argparser: Namespace(backbone='./vgg16_reducedfc.pth', batch_size=16, checkpoint='', cuda=True, demo=False, lr=0.001, seed=233, start_iter=0, test=False, threads=4, voc_root='/home/han/data/VOCdevkit')
./models/SSD.py:22: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
  nn.init.constant(self.weight, scale)
./models/SSD.py:111: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
  nn.init.xavier_uniform(m.weight.data)
Backbone loaded!
/home/han/virtualenv/py36/lib/python3.6/site-packages/torch/nn/functional.py:52: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
  warnings.warn(warning.format(ret))
Traceback (most recent call last):
  File "train.py", line 268, in <module>
    train()
  File "train.py", line 149, in train
    loss.backward()
  File "/home/han/virtualenv/py36/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/han/virtualenv/py36/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

So I changed every inplace=True option in SSD.py to inplace=False. For example, on line 81 I changed

x = F.relu(m(x), inplace=True)

to

x = F.relu(m(x), inplace=False)

But the same error still occurs.

Is there any solution?

My environment exactly matches the required one (Python 3.6 plus the other requirements).

seongkyun commented 6 years ago

This is solved by using PyTorch 0.3. See https://github.com/pytorch/pytorch/issues/7329

yosunpeng commented 5 years ago

I ran into this RuntimeError when I tried to train SSD on the VOC dataset. There is more background on the PyTorch forum: https://discuss.pytorch.org/t/encounter-the-runtimeerror-one-of-the-variables-needed-for-gradient-computation-has-been-modified-by-an-inplace-operation/836

In my case, I changed how the x variable is used in SSD.py and it worked for me. Hope it helps!

Line 25 in models/SSD.py:

x /= (x.pow(2).sum(dim=1, keepdim=True).sqrt() + 1e-10)

to

x /= (x.clone().pow(2).sum(dim=1, keepdim=True).sqrt() + 1e-10)

You may face the same issue when implementing other models such as SSD512. You can debug it yourself with the anomaly detection feature in PyTorch. Change the code in train.py:

Line 268 in train.py:

train()

to

with torch.autograd.set_detect_anomaly(True):
    train()

For more information: pytorch/pytorch#15803
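For anyone who wants to reproduce this outside the repo, here is a minimal sketch of what I believe is going on. It is not the repo's actual L2Norm layer: the function names, the tensor shape, and the `w * 2.0` stand-in for a conv feature map are all made up for illustration. The idea is that `x.pow(2)` keeps a reference to `x` for the backward pass, and the in-place `/=` then overwrites that same tensor, so autograd refuses to use it. The third variant (plain out-of-place division) was not proposed in this thread; I am including it only as an alternative that avoids the in-place write entirely.

```python
import torch


def l2norm_inplace(x, eps=1e-10):
    # Original form: pow(2) saves x for backward, then `/=` overwrites x.
    x /= (x.pow(2).sum(dim=1, keepdim=True).sqrt() + eps)
    return x


def l2norm_clone(x, eps=1e-10):
    # Fix from this thread: pow(2) saves the clone, which is never modified.
    x /= (x.clone().pow(2).sum(dim=1, keepdim=True).sqrt() + eps)
    return x


def l2norm_out_of_place(x, eps=1e-10):
    # Alternative (my assumption, not from the thread): no in-place write.
    return x / (x.pow(2).sum(dim=1, keepdim=True).sqrt() + eps)


def backward_ok(norm_fn, detect_anomaly=False):
    """Run forward and backward once; return (succeeded, error message)."""
    torch.manual_seed(0)
    w = torch.randn(2, 3, 4, 4, requires_grad=True)
    # Anomaly mode must wrap the forward pass too, so that autograd can
    # record where each operation was created.
    with torch.autograd.set_detect_anomaly(detect_anomaly):
        x = w * 2.0  # hypothetical stand-in for a feature map (non-leaf tensor)
        try:
            norm_fn(x).sum().backward()
        except RuntimeError as err:
            return False, str(err)
    return True, ""


if __name__ == "__main__":
    for fn in (l2norm_inplace, l2norm_clone, l2norm_out_of_place):
        ok, msg = backward_ok(fn)
        print(fn.__name__, "ok" if ok else "failed: " + msg.splitlines()[0])
```

Calling `backward_ok(l2norm_inplace, detect_anomaly=True)` should additionally print a traceback pointing at the forward operation whose saved tensor was overwritten, which is how you can locate the offending line in other models.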