knazeri / edge-connect

EdgeConnect: Structure Guided Image Inpainting using Edge Prediction, ICCV 2019 https://arxiv.org/abs/1901.00212
http://openaccess.thecvf.com/content_ICCVW_2019/html/AIM/Nazeri_EdgeConnect_Structure_Guided_Image_Inpainting_using_Edge_Prediction_ICCVW_2019_paper.html

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation #124

Open renleidewenming opened 4 years ago

renleidewenming commented 4 years ago

When I train the edge model, the following RuntimeError appears and I don't know how to deal with it. Can anyone help me? My PyTorch version is 1.0.0a0+90737f7.

Traceback (most recent call last):
  File "train.py", line 2, in <module>
    main(mode=1)
  File "/home/share/edge-connect/main.py", line 56, in main
    model.train()
  File "/home/share/edge-connect/src/edge_connect.py", line 115, in train
    self.edge_model.backward(gen_loss, dis_loss)
  File "/home/share/edge-connect/src/models.py", line 145, in backward
    dis_loss.backward()
  File "/usr/local/lib/python3.5/dist-packages/torch/tensor.py", line 102, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/usr/local/lib/python3.5/dist-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Guanyunlph commented 3 years ago

Hi, I met the same problem. Have you solved it?

g-h-anna commented 3 years ago

> Hi, I met the same problem. Have you solved it?

Hi, do you have a solution?

cgsaxner commented 3 years ago

I ran into the same problem when trying to train the network with a newer PyTorch version (1.9.0). It looks like it is connected to this issue: https://github.com/pytorch/pytorch/issues/39141

Following what is described in that issue, I adjusted the backward functions in the edge and inpaint models so that both backward() passes run before either optimizer steps, e.g.:

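def backward(self, gen_loss=None, dis_loss=None):
    # Run both backward passes first, while the weights are still unmodified.
    dis_loss.backward()
    gen_loss.backward()

    # Only then let the optimizers update the weights in place.
    self.dis_optimizer.step()
    self.gen_optimizer.step()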

The checks that raise this error do not seem to have been implemented in PyTorch versions < 1.5.0, so the gradients computed with earlier versions might actually be incorrect.
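To make the ordering issue concrete, here is a minimal sketch outside of the edge-connect code base. The two `nn.Linear` layers, the `compute_losses` helper and both `backward_*` functions are hypothetical stand-ins invented for illustration, not the project's actual models; the only thing they share with the real code is the pattern of a generator loss that flows through the discriminator.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the generator and discriminator (not edge-connect's models).
gen = nn.Linear(4, 4)
dis = nn.Linear(4, 1)
gen_optimizer = torch.optim.Adam(gen.parameters(), lr=1e-3)
dis_optimizer = torch.optim.Adam(dis.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()


def compute_losses():
    """Build both losses from one forward pass, as a GAN training step typically does."""
    gen_optimizer.zero_grad()
    dis_optimizer.zero_grad()
    real = torch.randn(8, 4)
    fake = gen(torch.randn(8, 4))
    ones, zeros = torch.ones(8, 1), torch.zeros(8, 1)
    # Discriminator loss: fake is detached, so no gradient reaches the generator.
    dis_loss = bce(dis(real), ones) + bce(dis(fake.detach()), zeros)
    # Generator loss: its graph runs through the discriminator's weights.
    gen_loss = bce(dis(fake), ones)
    return gen_loss, dis_loss


def backward_interleaved(gen_loss, dis_loss):
    """Step between the two backward passes (the ordering that triggers the error)."""
    dis_loss.backward()
    dis_optimizer.step()  # modifies the discriminator weights in place
    gen_loss.backward()   # still needs the weights as they were in the forward pass
    gen_optimizer.step()


def backward_reordered(gen_loss, dis_loss):
    """Run both backward passes before any optimizer step."""
    dis_loss.backward()
    gen_loss.backward()
    dis_optimizer.step()
    gen_optimizer.step()
```

Calling `backward_reordered(*compute_losses())` should run cleanly, while `backward_interleaved(*compute_losses())` is expected to raise the in-place RuntimeError on recent PyTorch versions. One caveat that holds for this sketch: with the reordered version, `gen_loss.backward()` also accumulates gradients into the discriminator's parameters before `dis_optimizer.step()` runs, so the discriminator update is not identical to the interleaved ordering.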

dhruvagarwal commented 3 years ago

The problem occurs with PyTorch versions > 1.5.0. This project uses PyTorch 1.0, so try sticking to that version together with torchvision 0.3.0 to run it.
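If you want to check which side of that version boundary your environment is on, something like the following could help. It is only a sketch: `parse_version` and `is_at_least` are hypothetical helper names, and the 1.5.0 threshold is taken from the comments in this thread rather than verified against PyTorch's release history.

```python
import re


def parse_version(version):
    """Return (major, minor, patch) from a string such as '1.0.0a0+90737f7'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_at_least(version, minimum=(1, 5, 0)):
    """True if the version is at or above the threshold (1.5.0 is an assumption from this thread)."""
    return parse_version(version) >= minimum
```

In a real environment you would pass `torch.__version__` to `is_at_least` and, if it returns True, either pin the older versions suggested here or reorder the backward passes.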