Training the inpainting network is very slow, so I want to train it on multiple GPUs. However, after adding torch.nn.DataParallel to the code, I always get the following error:
RuntimeError: Expected tensor for argument #1 'input' to have the same device as tensor for argument #2 'weight'; but device 1 does not equal 0 (while checking arguments for cudnn_convolution)
The error is raised at line 623 of inpainting_unet.py, x = self.unet(x). More specifically, it comes from line 222 of model/common_blocks.py, x, x_unpooled = self.encoders[i](x), which is reached through self.unet(x).
I have tried many solutions found on the web, but none of them resolved the problem.
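A commonly reported cause of this device mismatch under DataParallel is a set of submodules kept in a plain Python list instead of nn.ModuleList: such modules are never registered, so they are not copied to the other GPUs and their weights stay on device 0. Whether self.encoders in model/common_blocks.py is built that way is an assumption, not something confirmed here. The sketch below shows one way to check; find_unregistered_modules, PlainListEncoder, and RegisteredEncoder are hypothetical names for illustration, not code from the repository.

```python
import torch.nn as nn


def find_unregistered_modules(model: nn.Module):
    """Return (owner_path, attribute_name) pairs for nn.Module objects kept in
    plain Python lists, tuples, or dicts, which nn.Module does not register."""
    found = []
    for path, module in model.named_modules():
        for attr, value in vars(module).items():
            if attr.startswith("_"):
                continue  # skip nn.Module internals such as _modules and _parameters
            if isinstance(value, (list, tuple)):
                items = value
            elif isinstance(value, dict):
                items = value.values()
            else:
                continue
            if any(isinstance(item, nn.Module) for item in items):
                found.append((path or "<root>", attr))
    return found


class PlainListEncoder(nn.Module):
    """Hypothetical stand-in: layers held in a plain list (not registered)."""

    def __init__(self):
        super().__init__()
        # Invisible to .parameters() and .to(), and not replicated by DataParallel.
        self.encoders = [nn.Conv2d(3, 8, 3, padding=1), nn.Conv2d(8, 8, 3, padding=1)]


class RegisteredEncoder(nn.Module):
    """Same layers, registered through nn.ModuleList."""

    def __init__(self):
        super().__init__()
        self.encoders = nn.ModuleList(
            [nn.Conv2d(3, 8, 3, padding=1), nn.Conv2d(8, 8, 3, padding=1)]
        )


if __name__ == "__main__":
    print(find_unregistered_modules(PlainListEncoder()))
    print(find_unregistered_modules(RegisteredEncoder()))
```

Running find_unregistered_modules on the real model before wrapping it in DataParallel would show whether any attribute holds unregistered layers. If it does, converting that attribute to nn.ModuleList is the usual remedy. If it returns nothing, the cause lies elsewhere, for example a tensor created with a hard-coded device inside forward.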