Closed dearleiii closed 6 years ago
You can check with the following command if you are on a Linux system:
watch -n 1 'free -m'
Running large CNNs on the CPU is especially memory-demanding. If you have a GPU, use it with cuDNN instead.
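In code, the device choice can be made conditional (a minimal sketch; torch falls back to the CPU when no GPU is visible):

```python
import torch

# Prefer the GPU when one is available; otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```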
On this machine I cannot set CUDA_VISIBLE_DEVICES=0, cannot export PATH, cannot change /bin/bash, and cannot install htop.
leichen@gpu-compute4$ python3 scatter_plots.py
cuda:2
0
Traceback (most recent call last):
File "scatter_plots.py", line 36, in <module>
After the change:
leichen@gpu-compute4$ python3 scatter_plots.py
cuda:0
0
(nvidia-smi output: only the table header survived; the memory-usage rows were not captured)
File "/home/home2/leichen/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 176, in _apply
module._apply(fn)
File "/home/home2/leichen/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 182, in _apply
param.data = fn(param.data)
File "/home/home2/leichen/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 393, in <lambda>
Exp4. Generally speaking, the pattern is:
call .cuda() on any input batches/tensors
call .cuda() on your network module, which holds your layers, like:
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn. …
        self.layer2 = nn. …
        # … etc …
then just do:
model = MyModel()
model.cuda()
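Putting the pattern together, a runnable sketch (the layer sizes here are hypothetical; .to(device) is the device-agnostic equivalent of .cuda(), so the same code also runs on a CPU-only box):

```python
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Hypothetical layers; substitute your own architecture.
        self.layer1 = nn.Linear(10, 32)
        self.layer2 = nn.Linear(32, 1)

    def forward(self, x):
        return self.layer2(torch.relu(self.layer1(x)))

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = MyModel().to(device)            # same effect as model.cuda() on a GPU box
batch = torch.randn(4, 10).to(device)   # inputs must live on the same device
output = model(batch)
```

The important invariant is that the model's parameters and the input tensors live on the same device; mixing them raises a runtime error.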
Basically it is just one line to use DataParallel:
net = torch.nn.DataParallel(model, device_ids=[0, 1, 2])
output = net(input_var)
Just wrap your model with DataParallel and call the returned net on your data. The device_ids parameter specifies which GPUs to use.
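A self-contained sketch of that one-liner (nn.Linear stands in for the real model; the GPU list is detected at runtime, and with no GPUs DataParallel simply calls the wrapped module, so this also runs on CPU):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                       # placeholder for your model
gpus = list(range(torch.cuda.device_count()))  # e.g. [0, 1, 2] on a 3-GPU node
if gpus:
    model = model.cuda()
net = nn.DataParallel(model, device_ids=gpus or None)

x = torch.randn(8, 10)
if gpus:
    x = x.cuda()
output = net(x)   # the batch dimension is split across the listed GPUs
```

Note that DataParallel splits along the batch dimension, so the per-GPU memory footprint shrinks roughly in proportion to the number of devices.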
File "scatter_plots.py", line 134, in <module>
trainNet(approximator, batch_size = 100, n_epochs = 5, learning_rate = 0.001)
File "scatter_plots.py", line 98, in trainNet
outputs = net(inputs)[0]
File "/home/home2/leichen/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/home2/leichen/SuperResolutor/Approx_discrim/apxm.py", line 56, in forward
x = self.main(x)
File "/home/home2/leichen/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/home2/leichen/.local/lib/python3.5/site-packages/torch/nn/modules/container.py", line 91, in forward
input = module(input)
File "/home/home2/leichen/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/home2/leichen/.local/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 301, in forward
self.padding, self.dilation, self.groups)
RuntimeError: $ Torch: not enough memory: you tried to allocate 12GB. Buy new RAM! at /pytorch/aten/src/TH/THGeneral.c:218
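The 12 GB allocation here scales with the batch size, so the usual first fix is to shrink the batch. A hedged sketch of that fallback (fit_batch_size and run_step are hypothetical names; run_step is a closure that runs one training step at a given batch size):

```python
def fit_batch_size(run_step, sizes=(100, 50, 25, 10)):
    """Try progressively smaller batch sizes until one fits in memory.

    run_step is a hypothetical closure that runs one training step at the
    given batch size and raises RuntimeError on an allocation failure.
    """
    for bs in sizes:
        try:
            run_step(bs)
            return bs
        except RuntimeError as err:
            if "memory" not in str(err).lower():
                raise  # not an out-of-memory error: re-raise it
    raise RuntimeError("even the smallest batch size did not fit")
```

For the trainNet call in the traceback, that would mean retrying with batch_size=50, 25, and so on until the allocation succeeds.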