Closed ziqizh closed 4 years ago
I had the same issue. Part of the problem was solved by loading the nvidia and nvidia-uvm drivers. They had been disabled by a blacklist configuration in /lib/modprobe.d/blacklist-nvidia.conf, so I commented out all the lines in that file.
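One way to confirm that both modules are really loaded after editing the blacklist is to read /proc/modules. This is only a sketch and assumes a Linux system; the helper name `missing_modules` is made up for illustration. Note that the kernel lists module names with underscores (`nvidia_uvm`), not hyphens.

```python
from pathlib import Path

# Kernel module names as they appear in /proc/modules (underscores, not hyphens).
REQUIRED = ("nvidia", "nvidia_uvm")


def missing_modules(proc_modules_text, required=REQUIRED):
    """Return the required kernel modules that are absent from /proc/modules text."""
    # The first whitespace-separated field of each line is the module name.
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in required if name not in loaded]


if __name__ == "__main__":
    path = Path("/proc/modules")
    if path.exists():
        print("missing:", missing_modules(path.read_text()) or "none")
    else:
        print("/proc/modules not found (not a Linux system?)")
```

If a module shows up as missing, loading it by hand with `sudo modprobe nvidia` or `sudo modprobe nvidia-uvm` should tell you whether the blacklist is still in effect.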
Also, you need to launch the visdom server in a new shell:
python3 -m visdom.server
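To check that the server is actually up before starting training, a plain TCP probe is enough. This sketch assumes visdom is listening on its default port 8097 on localhost; `server_is_up` is a hypothetical helper, and a successful connection only shows that something is listening there, not that it is visdom.

```python
import socket


def server_is_up(host="localhost", port=8097, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (refused, timeout, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("visdom reachable:", server_is_up())
```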
But part of the problem still remains:
Setting up a new session...
Running PGAN
size 10
202599 images found
202599 images detected
size (4, 4)
202599 images found
Changing alpha to 0.000
Traceback (most recent call last):
  File "./train.py", line 151, in <module>
    GANTrainer.train()
  File "/home/user/pytorch-gan-zoo/pytorch_GAN_zoo/models/trainer/progressive_gan_trainer.py", line 237, in train
    maxIter=self.modelConfig.maxIterAtScale[scale])
  File "/home/user/pytorch-gan-zoo/pytorch_GAN_zoo/models/trainer/gan_trainer.py", line 479, in trainOnEpoch
    inputs_real = self.inScaleUpdate(i, scale, inputs_real)
  File "/home/user/pytorch-gan-zoo/pytorch_GAN_zoo/models/trainer/progressive_gan_trainer.py", line 166, in inScaleUpdate
    self.model.updateAlpha(alpha)
  File "/home/user/pytorch-gan-zoo/pytorch_GAN_zoo/models/progressive_gan.py", line 134, in updateAlpha
    self.avgG.module.setNewAlpha(newAlpha)
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 576, in __getattr__
    type(self).__name__, name))
AttributeError: 'GNet' object has no attribute 'module'
I could solve the last part of the problem using the solution below.
Per https://github.com/pytorch/pytorch/issues/28321, you also need to update your CUDA drivers. torch.cuda.is_available() returning False is what triggers the "'GNet' object has no attribute 'module'" error.
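Why a missing GPU leads to this particular AttributeError: torch.nn.DataParallel exposes the wrapped network as `.module`, while a plain network has no such attribute. The sketch below assumes the trainer only wraps its networks in DataParallel when CUDA is available, which matches the traceback but is not confirmed here. `unwrap` is a hypothetical helper, not part of pytorch_GAN_zoo, and the two classes are stand-ins so the example runs without torch.

```python
def unwrap(model):
    """Return model.module if the model is wrapped (e.g. by DataParallel), else the model itself."""
    return getattr(model, "module", model)


class PlainNet:
    """Stand-in for an unwrapped network such as GNet (CPU case)."""

    def __init__(self):
        self.alpha = None

    def setNewAlpha(self, alpha):
        self.alpha = alpha


class Wrapped:
    """Stand-in for a DataParallel-style wrapper that exposes `.module` (GPU case)."""

    def __init__(self, module):
        self.module = module


cpu_net = PlainNet()
gpu_net = Wrapped(PlainNet())

# Calling net.module.setNewAlpha(...) directly would fail for cpu_net;
# going through unwrap() works for both cases.
for net in (cpu_net, gpu_net):
    unwrap(net).setNewAlpha(0.0)
```

This only illustrates the mechanism. If you expect to train on a GPU, the real fix is still to get `python3 -c "import torch; print(torch.cuda.is_available())"` to print True.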
It turned out to be a display issue: my Linux environment doesn't have a display, and I solved this by commenting out the related code.
Running python3 -m visdom.server worked for debugging my error.
I am using Python 3.8 and torch 1.3.1.