mchong6 / JoJoGAN

Official PyTorch repo for JoJoGAN: One Shot Face Stylization
MIT License

Cuda run out of memory #26

Closed andriken closed 2 years ago

andriken commented 2 years ago

```
(jojo) (jojo) PS C:\Users\Admin\Documents\JoJoGAN> python train_custom_style.py --model_name sophie --alpha 0.0 --preserve_color False --num_iter 300 --device cuda
  0%|          | 0/300 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "train_custom_style.py", line 103, in <module>
    fake_feat = discriminator(img)
  File "C:\Users\Admin\MiniConda3\envs\jojo\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Admin\Documents\JoJoGAN\model.py", line 665, in forward
    out = block(out)
  File "C:\Users\Admin\MiniConda3\envs\jojo\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Admin\Documents\JoJoGAN\model.py", line 621, in forward
    out = self.conv2(out)
  File "C:\Users\Admin\MiniConda3\envs\jojo\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Admin\MiniConda3\envs\jojo\lib\site-packages\torch\nn\modules\container.py", line 141, in forward
    input = module(input)
  File "C:\Users\Admin\MiniConda3\envs\jojo\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Admin\Documents\JoJoGAN\model.py", line 126, in forward
    padding=self.padding,
  File "C:\Users\Admin\Documents\JoJoGAN\op\conv2d_gradfix.py", line 32, in conv2d
    ).apply(input, weight, bias)
  File "C:\Users\Admin\Documents\JoJoGAN\op\conv2d_gradfix.py", line 138, in forward
    out = F.conv2d(input=input, weight=weight, bias=bias, **common_kwargs)
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 8.00 GiB total capacity; 4.95 GiB already allocated; 0 bytes free; 5.49 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```

Is there any way to lower the batch size? How would I do that? I don't know whether that would even help, but can you please help me fix this problem? For what it's worth, I also tried the allocator hint mentioned in the error message (sketch below).
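The error message suggests setting `max_split_size_mb` via `PYTORCH_CUDA_ALLOC_CONF`. A minimal way to try that, assuming the variable is set before PyTorch initializes CUDA (the value 128 is just an example, and this only helps with fragmentation, not a model that genuinely doesn't fit):

```python
# Set the allocator hint before any CUDA tensors are created, e.g. at the very
# top of train_custom_style.py. The value 128 MiB is an arbitrary example.
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # must be imported (in a fresh process) after the variable is set
```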

mchong6 commented 2 years ago

If you are already using only one reference image, your GPU simply has too little memory. The batch size is already 1, so you can't lower it any further. One thing you could try is computing the loss with LPIPS at 256 resolution instead of using the discriminator. Otherwise there isn't much you can do.
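A minimal sketch of that idea, not the repo's exact code: swap the discriminator feature loss for an LPIPS perceptual loss computed at 256x256. Here `img` (generator output) and `targets` (the style reference batch) are assumed to be NCHW tensors from the existing training loop, and the pip `lpips` package is used, which may differ from the LPIPS code bundled with JoJoGAN.

```python
import torch
import torch.nn.functional as F
import lpips  # pip install lpips

device = "cuda"
percept = lpips.LPIPS(net="vgg").to(device)

def style_loss_256(img, targets):
    # Downsample both images to 256x256 before the perceptual comparison;
    # this is where the memory saving over the full-resolution discriminator
    # feature loss comes from.
    img_256 = F.interpolate(img, size=(256, 256), mode="bilinear", align_corners=False)
    tgt_256 = F.interpolate(targets, size=(256, 256), mode="bilinear", align_corners=False)
    return percept(img_256, tgt_256).mean()
```

You would then replace the `fake_feat = discriminator(img)` loss computation in the training loop with `loss = style_loss_256(img, targets)`; results may differ somewhat from the discriminator-based loss used in the paper.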