Status: Open. Lotayou opened this issue 6 years ago.
@Lotayou, I tried this: in trainer.py I replaced every Variable declaration of the form Variable(x) with Variable(x, volatile=True). See if it works for you.
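For readers on newer PyTorch releases: the volatile flag suggested above was deprecated in PyTorch 0.4 and no longer has any effect; the torch.no_grad() context manager is its replacement. Below is a minimal sketch of the same idea, assuming a small stand-in network rather than the repository's actual models (the names model and x are illustrative and do not come from trainer.py):

```python
import torch
import torch.nn as nn

# Stand-in network (an assumption for illustration); the thread itself
# concerns Inception-v3 and the StackGAN-v2 generator/discriminators.
model = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU())
model.eval()

# Dummy input batch: 4 RGB images of 32x32 pixels.
x = torch.randn(4, 3, 32, 32)

# Inside this block autograd records no graph, so intermediate
# activations are not kept alive for a backward pass.
with torch.no_grad():
    y = model(x)

print(y.requires_grad)  # False: no gradient history is attached to y
```

Note that this only saves the memory spent on intermediate activations during a forward pass. The traceback in the original post fails earlier, while the weights are being copied to the GPU, so it may not help in that case.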
@hanzhanggit @Lotayou - did you resolve this issue?
@shirishr any fix for this?
Traceback (most recent call last):
File "main.py", line 146, in
Reduce the BATCH_SIZE in cfg/eval_birds.yml to generate images without running out of memory.
I reduced BATCH_SIZE to 9 and it ran on a 4 GB GPU (I also made the Variables volatile). See if that works for you.
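If it is unclear which BATCH_SIZE will fit on a given card, one option is to halve it until a run succeeds. The sketch below is not from the repository: first_batch_size_that_fits and fake_eval are hypothetical names, and fake_eval merely simulates a GPU that runs out of memory above a batch size of 9, so the example runs without a GPU.

```python
def first_batch_size_that_fits(run_once, start, minimum=1):
    """Halve the batch size until run_once(batch_size) stops raising an OOM error.

    run_once is any callable that performs one evaluation step and raises
    RuntimeError with "out of memory" in its message on failure.
    """
    batch_size = start
    while batch_size >= minimum:
        try:
            run_once(batch_size)
            return batch_size
        except RuntimeError as err:
            # Only swallow out-of-memory errors; re-raise anything else.
            if "out of memory" not in str(err):
                raise
            batch_size //= 2
    raise RuntimeError("even the minimum batch size ran out of memory")


def fake_eval(batch_size, limit=9):
    """Hypothetical stand-in for one evaluation step on a small GPU."""
    if batch_size > limit:
        raise RuntimeError("cuda runtime error (2) : out of memory")


chosen = first_batch_size_that_fits(fake_eval, start=24)
print(chosen)  # 24 and 12 fail, 6 succeeds
```

Halving finds a size that fits, not necessarily the largest one (here it settles on 6 although 9 would also work), so the result can be nudged upward by hand and written back to the config file.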
I am trying to reproduce the results but get stuck on a CUDA out-of-memory error when loading the Inception-v3 model.
I tried on both a Windows 10 PC with an Nvidia 1060X graphics card (6 GB) and a Linux server with an Nvidia GeForce Titan graphics card (12 GB), but both times I ran out of memory with the following message:
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1513368888240/work/torch/lib/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "main.py", line 144, in <module>
    algo.train()
  File "/backup1/lingboyang/StackGANv2/code/trainer.py", line 666, in train
    self.inception_model, start_count = load_network(self.gpus)
  File "/backup1/lingboyang/StackGANv2/code/trainer.py", line 126, in load_network
    netsD[i] = torch.nn.DataParallel(netsD[i], device_ids=gpus)
  File "/home/vcl/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 59, in __init__
    self.module.cuda(device_ids[0])
  File "/home/vcl/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 216, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/home/vcl/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 146, in _apply
    module._apply(fn)
  File "/home/vcl/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 146, in _apply
    module._apply(fn)
  File "/home/vcl/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 152, in _apply
    param.data = fn(param.data)
  File "/home/vcl/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 216, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "/home/vcl/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/_utils.py", line 69, in _cuda
    return new_type(self.size()).copy_(self, async)
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1513368888240/work/torch/lib/THC/generic/THCStorage.cu:58
Is this normal? @hanzhanggit Could you tell me the minimum hardware requirements to run this program? Is there any way to reduce GPU memory usage? Thanks!