lippman1125 / pytorch_FAN

face alignment pytorch training

CUDA out of memory. #3

Open ghost opened 5 years ago

ghost commented 5 years ago
Training
Traceback (most recent call last):
  File "main.py", line 323, in <module>
    main(args)
  File "main.py", line 137, in main
    args.debug, args.flip)
  File "main.py", line 198, in train
    output = model(input_var)
  File "/home/john/Virtualenv/venv/lib/python3.5/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/john/Virtualenv/venv/lib/python3.5/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/john/Virtualenv/venv/lib/python3.5/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/john/Documents/Projects/Pytorch/pytorch_FAN-master/models/fan_model.py", line 184, in forward
    hg = self._modules['m' + str(i)](previous)
  File "/home/john/Virtualenv/venv/lib/python3.5/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/john/Documents/Projects/Pytorch/pytorch_FAN-master/models/fan_model.py", line 142, in forward
    return self._forward(self.depth, x)
  File "/home/john/Documents/Projects/Pytorch/pytorch_FAN-master/models/fan_model.py", line 137, in _forward
    up2 = F.interpolate(low3, scale_factor=2, mode='nearest')
  File "/home/john/Virtualenv/venv/lib/python3.5/site-packages/torch/nn/functional.py", line 2429, in interpolate
    return torch._C._nn.upsample_nearest2d(input, _output_size(2))

RuntimeError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 3.95 GiB total capacity; 2.85 GiB already allocated; 123.81 MiB free; 103.31 MiB cached)

What's wrong?

xsacha commented 5 years ago

I think it was using 5 or 6 GB on my GV100, so in all likelihood your 4 GB card (3.95 GiB total capacity in the error message) simply doesn't have enough memory. Try reducing the batch size if you don't have a bigger card or more cards.

Edit:

pytorch_FAN$ nvidia-smi
Tue May 21 09:09:31 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro GV100        Off  | 00000000:01:00.0  On |                  Off |
| 52%   69C    P2   179W / 250W |   5454MiB / 32475MiB |     72%      Default |
+-------------------------------+----------------------+----------------------+
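
To make the "reduce the batch size" suggestion concrete, here is a minimal sketch of one way to search for a batch size that fits on a small card. It is not code from this repository: `largest_fitting_batch_size` and the `run_step` callable are hypothetical names, and `run_step` is assumed to run one forward/backward pass at the given batch size. The only thing it relies on from PyTorch is that a CUDA OOM surfaces as a `RuntimeError` whose message contains "out of memory", as in the traceback above.

```python
def largest_fitting_batch_size(run_step, start, minimum=1):
    """Halve the batch size until run_step(batch_size) stops raising a CUDA OOM.

    run_step is a hypothetical callable that performs one forward/backward
    pass with the given batch size (for example, a thin wrapper around one
    iteration of the training loop).
    """
    batch_size = start
    while batch_size >= minimum:
        try:
            run_step(batch_size)
            return batch_size
        except RuntimeError as err:
            # Only swallow out-of-memory errors; anything else is a real bug.
            if "out of memory" not in str(err):
                raise
            batch_size //= 2  # try again with half as many samples
    raise RuntimeError("batch size %d does not fit in GPU memory" % minimum)
```

In a real run it would probably also help to call `torch.cuda.empty_cache()` between attempts so that cached blocks from the failed attempt are released before retrying. Once a fitting size is found, pass it to the training script through whatever batch-size option it exposes.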