isht7 / pytorch-deeplab-resnet

DeepLab resnet v2 model in pytorch
MIT License

Unable to train on multiple gpus #27

Closed. omkar13 closed this issue 6 years ago

omkar13 commented 6 years ago

Hi! I have two NVIDIA 1080 Ti GPUs and followed this tutorial to train on both of them: http://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html. However, I am getting the following error:

Traceback (most recent call last):
  File "/home/omkar/pycharm-community-2017.3.4/helpers/pydev/pydev_run_in_console.py", line 53, in run_file
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/omkar/Documents/Omkar/PycharmProjects/Masktrack1/pytorch-deeplab-resnet-master/train_online_multiple_objs&gpus.py", line 325, in <module>
    lr=base_lr, momentum=0.9, weight_decay=weight_decay)
  File "/home/omkar/anaconda3/envs/deeplab_resnet/lib/python2.7/site-packages/torch/optim/sgd.py", line 57, in __init__
    super(SGD, self).__init__(params, defaults)
  File "/home/omkar/anaconda3/envs/deeplab_resnet/lib/python2.7/site-packages/torch/optim/optimizer.py", line 39, in __init__
    self.add_param_group(param_group)
  File "/home/omkar/anaconda3/envs/deeplab_resnet/lib/python2.7/site-packages/torch/optim/optimizer.py", line 146, in add_param_group
    param_group['params'] = list(params)
  File "/home/omkar/Documents/Omkar/PycharmProjects/Masktrack1/pytorch-deeplab-resnet-master/train_online_multiple_objs&gpus.py", line 106, in get_1x_lr_params_NOscale
    b.append(model.Scale.conv1)
  File "/home/omkar/anaconda3/envs/deeplab_resnet/lib/python2.7/site-packages/torch/nn/modules/module.py", line 398, in __getattr__
    type(self).__name__, name))
AttributeError: 'DataParallel' object has no attribute 'Scale'

The error occurs because the 'get_1x_lr_params_NOscale' method accesses model.Scale, but the network has been wrapped in the DataParallel class, and the wrapper itself has no 'Scale' attribute. Could you suggest a solution to this problem? Thank you!
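A minimal sketch that seems to reproduce the same failure outside the repository. ToyNet is a hypothetical stand-in for the real network; only the attribute name Scale is taken from the traceback, and the layer sizes are arbitrary assumptions.

```python
import torch.nn as nn


class ToyNet(nn.Module):
    """Hypothetical stand-in for the real model; only the name `Scale` mirrors the traceback."""

    def __init__(self):
        super(ToyNet, self).__init__()
        self.Scale = nn.Conv2d(3, 8, kernel_size=3)

    def forward(self, x):
        return self.Scale(x)


# Wrapping works with or without GPUs; no forward pass is needed to see the issue.
model = nn.DataParallel(ToyNet())

# DataParallel registers the original network as its submodule `module`,
# so `Scale` is not an attribute of the wrapper itself.
try:
    model.Scale
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
print(hasattr(model, "module"), hasattr(model.module, "Scale"))
```

The last line suggests that the original network is still reachable, just one level down, under the wrapper's module attribute.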

omkar13 commented 6 years ago

Problem solved. I referred to this thread: https://discuss.pytorch.org/t/how-to-reach-model-attributes-wrapped-by-nn-dataparallel/1373
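For anyone hitting the same error, here is a rough sketch of the usual approach, which is to reach the original network through the wrapper's module attribute. ToyNet, unwrap and scale_params are hypothetical names for illustration and are not the repository's code; the lr and weight_decay values are placeholders, and only momentum=0.9 comes from the traceback.

```python
import torch.nn as nn
import torch.optim as optim


class ToyNet(nn.Module):
    """Hypothetical stand-in for the real model; only the name `Scale` mirrors the traceback."""

    def __init__(self):
        super(ToyNet, self).__init__()
        self.Scale = nn.Conv2d(3, 8, kernel_size=3)

    def forward(self, x):
        return self.Scale(x)


def unwrap(model):
    """Return the underlying network whether or not it is wrapped in DataParallel."""
    return model.module if isinstance(model, nn.DataParallel) else model


def scale_params(model):
    """Yield trainable parameters of the `Scale` submodule (illustrative helper)."""
    for param in unwrap(model).Scale.parameters():
        if param.requires_grad:
            yield param


model = nn.DataParallel(ToyNet())

# Placeholder hyperparameters, except momentum, which matches the traceback.
optimizer = optim.SGD(scale_params(model), lr=0.001, momentum=0.9,
                      weight_decay=0.0005)
```

Going through a helper like unwrap keeps the parameter-group functions usable for both single-GPU and multi-GPU runs, instead of hard-coding model.module everywhere.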