Error when training the network with multiple GPUs

Hi, @amlankar. I am trying to train the model on multiple GPUs, because multi-GPU training can not only reduce training time substantially but, in theory, also improve accuracy through a larger batch size. Unfortunately, I got an AssertionError. Here is what I did. I wrapped the model with nn.DataParallel() like this:

model = polyrnnpp.PolyRNNpp(self.opts)
model = nn.DataParallel(model, device_ids=(0, 1))  # I have two 1080 Ti GPUs
self.model = model.cuda()

For training, I move the input data to the GPU like this:

 # Forward pass
 output = self.model(data['img'].type(self.dtype).cuda(), data['fwd_poly'].type(self.dtype).cuda())

Then the AssertionError occurred:

Starting training
Saved model
/home/ztian5/.local/lib/python2.7/site-packages/torch/nn/functional.py:1006: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
/home/ztian5/.local/lib/python2.7/site-packages/torch/nn/functional.py:995: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
  warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
Traceback (most recent call last):
  File "/mnt/data/polygonRNN_pluss/code/Scripts/train/train_ce.py", line 328, in <module>
    trainer.loop()
  File "/mnt/data/polygonRNN_pluss/code/Scripts/train/train_ce.py", line 148, in loop
    self.train(epoch)
  File "/mnt/data/polygonRNN_pluss/code/Scripts/train/train_ce.py", line 163, in train
    output = self.model(data['img'].type(self.dtype).cuda(), data['fwd_poly'].type(self.dtype).cuda())
  File "/home/ztian5/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ztian5/.local/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 124, in forward
    return self.gather(outputs, self.output_device)
  File "/home/ztian5/.local/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 136, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/ztian5/.local/lib/python2.7/site-packages/torch/nn/parallel/scatter_gather.py", line 67, in gather
    return gather_map(outputs)
  File "/home/ztian5/.local/lib/python2.7/site-packages/torch/nn/parallel/scatter_gather.py", line 61, in gather_map
    for k in out))
  File "/home/ztian5/.local/lib/python2.7/site-packages/torch/nn/parallel/scatter_gather.py", line 61, in <genexpr>
    for k in out))
  File "/home/ztian5/.local/lib/python2.7/site-packages/torch/nn/parallel/scatter_gather.py", line 54, in gather_map
    return Gather.apply(target_device, dim, *outputs)
  File "/home/ztian5/.local/lib/python2.7/site-packages/torch/nn/parallel/_functions.py", line 52, in forward
    assert all(map(lambda i: i.is_cuda, inputs))
AssertionError

Process finished with exit code 1
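
Reading the traceback, the assertion fails inside the gather step, which suggests that at least one tensor returned by the model's forward pass is not on a GPU. Below is a minimal sketch for locating such outputs. It assumes the model returns a nested dict/list/tuple of tensors, which I have not verified for PolyRNNpp; the helper name find_non_cuda is hypothetical and is not part of PyTorch or this repository. It only relies on the is_cuda attribute that tensors expose.

```python
def find_non_cuda(obj, path="output"):
    """Return the paths of tensor-like leaves whose is_cuda flag is False.

    Hypothetical debugging helper: walks nested dicts, lists and tuples
    and inspects every leaf that exposes an `is_cuda` attribute.
    Leaves without that attribute (ints, None, strings) are ignored.
    """
    bad = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            bad.extend(find_non_cuda(value, "%s[%r]" % (path, key)))
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            bad.extend(find_non_cuda(value, "%s[%d]" % (path, index)))
    elif hasattr(obj, "is_cuda"):
        if not obj.is_cuda:
            bad.append(path)
    return bad
```

One way to use it would be to call the unwrapped module once with the same inputs, for example find_non_cuda(self.model.module(img, poly)), where img and poly stand for the two CUDA tensors passed in the forward pass above. Any path it reports points at an output that is created on the CPU inside forward.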

By the way, training on a single GPU works fine; the error only occurs when I wrap the model in nn.DataParallel() for multi-GPU training. Could you give me some advice on training the network with multiple GPUs, or tell me what I did wrong? Thanks in advance for your reply ^_^.

fidler-lab / polyrnn-pp-pytorch
