bloodwass opened 5 years ago
First of all, there is no guaranteed support for the usage of private methods like _weight_drop.

That said, were you able to figure out a solution to this issue? Are there any changes that should be contributed to torchnlp?
Same problem when I use WeightDropLinear. It shows "arguments are located on different GPUs".
Similar problem here. When I try to use WeightDropGRU and move it to the GPU, I get an error: AttributeError: 'WeightDropGRU' object has no attribute '_flat_weights'
Any idea how to solve this?
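For reference, a minimal snippet along these lines should reproduce that report, assuming WeightDropGRU is the class exported by torchnlp.nn (the commenter's exact code is not shown here):

```python
import torch
from torchnlp.nn import WeightDropGRU  # assumed import; not the commenter's exact code

gru = WeightDropGRU(input_size=10, hidden_size=20, weight_dropout=0.5)
# Moving the module to the GPU reportedly raises:
# AttributeError: 'WeightDropGRU' object has no attribute '_flat_weights'
gru = gru.cuda()
```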
Expected Behavior
I want to convert torch.nn.Linear modules in my (possibly large) model to weight-drop linear modules, and I want to train the model on multiple GPUs. However, I get a RuntimeError with my sample code. First, I have a _weight_drop() function that drops part of the weights of a torch.nn.Linear (see the sketch below).
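The original code is not included above; the following is a minimal sketch of a _weight_drop()-style helper in the spirit of torchnlp's WeightDrop, written here only to make the discussion concrete (it is not the poster's exact code):

```python
import torch
from torch.nn import Parameter


def _weight_drop(module, weights, dropout):
    """Replace each listed weight with a '<name>_raw' parameter and re-apply
    dropout to it on every forward pass (DropConnect-style weight drop)."""
    for name_w in weights:
        w = getattr(module, name_w)
        del module._parameters[name_w]
        module.register_parameter(name_w + '_raw', Parameter(w.data))

    original_forward = module.forward

    def forward(*args, **kwargs):
        # Note: this closure captures `module`, i.e. the original module object,
        # not the per-GPU replica created by DataParallel.
        for name_w in weights:
            raw_w = getattr(module, name_w + '_raw')
            w = torch.nn.functional.dropout(raw_w, p=dropout, training=module.training)
            setattr(module, name_w, w)
        return original_forward(*args, **kwargs)

    setattr(module, 'forward', forward)


# Usage on a single device works as expected:
linear = torch.nn.Linear(8, 8)
_weight_drop(linear, ['weight'], dropout=0.5)
out = linear(torch.randn(2, 8))
```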
Actual Behavior
RuntimeError: arguments are located on different GPUs at /opt/conda/conda-bld/pytorch_1556653114079/work/aten/src/THC/generic/THCTensorMathBlas.cu:255
Steps to Reproduce the Problem
Run this code with Python 3.7 and PyTorch 1.1.0 on 2 GPUs.
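The reproduction script itself is missing above; this is a minimal sketch of the setup described, assuming the _weight_drop() helper sketched earlier and an illustrative SimpleModel (both names are placeholders, not taken from the original script):

```python
import torch
import torch.nn as nn


class SimpleModel(nn.Module):
    """Illustrative model with one Linear layer converted to weight drop."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(16, 16)
        _weight_drop(self.fc, ['weight'], dropout=0.5)  # helper sketched above

    def forward(self, x):
        return self.fc(x)


model = nn.DataParallel(SimpleModel().cuda(), device_ids=[0, 1])
input = torch.randn(4, 16).cuda()
output = model(input)  # fails with "arguments are located on different GPUs"
```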
However, "output = model(input)" fails in this code with the error message above.
The main reason for this error is that the linear multiplication is computed between two tensors that live on different GPUs. I tried to modify my _weight_drop() function to manually assign the current device inside the DataParallel forward pass, but it did not work. Does anyone have an idea how to fix this? The code works fine in single-GPU or CPU mode.
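One possible explanation is that the patched forward is a closure over the original module, so every DataParallel replica keeps reading the weights that live on GPU 0 while its input sits on another device. A workaround that is sometimes suggested (not verified in this thread) is to subclass nn.Linear and recompute the dropped weight from self.weight inside forward, so each replica uses its own parameters; the class name DropConnectLinear below is hypothetical:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DropConnectLinear(nn.Linear):
    """Hypothetical weight-drop Linear that avoids monkey-patching forward."""

    def __init__(self, in_features, out_features, bias=True, weight_dropout=0.0):
        super().__init__(in_features, out_features, bias=bias)
        self.weight_dropout = weight_dropout

    def forward(self, input):
        # Each DataParallel replica applies dropout to its own copy of the weight.
        w = F.dropout(self.weight, p=self.weight_dropout, training=self.training)
        return F.linear(input, w, self.bias)
```

Because the dropped weight is derived from the replica's own self.weight, the cross-device mismatch described above should not occur, although this has not been confirmed by the maintainers in this thread.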