lucidrains / lightweight-gan

Implementation of 'lightweight' GAN, proposed in ICLR 2021, in Pytorch. High resolution image generations that can be trained within a day or two
MIT License
1.63k stars 222 forks

CUDA out of memory error while generating interpolations #140

Open mertsaadet opened 1 year ago

mertsaadet commented 1 year ago

Hi everyone,

I'm trying to train a model on a custom dataset consisting of 380 images at 1024x1024 pixels. So far I've been using the following command to train:

lightweight_gan --data dataset/ --image-size 1024 --name remote --num-train-steps 100000 --batch-size 2

Since my GPU has only 4 GB of memory, the maximum batch size I can use is 2; anything larger gives a CUDA out of memory error. However, my issue is not about training, which works with batch size 2. The problem appears when I try to generate interpolations using my latest model (91k iterations):

lightweight_gan --name remote --generate-interpolation

I'm getting this error:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 768.00 MiB (GPU 0; 3.82 GiB total capacity; 2.48 GiB already allocated; 98.81 MiB free; 2.61 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

My GPU is a GeForce GTX 1650 Ti Mobile. I have tried export 'PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128' to eliminate the error, but it doesn't help.
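For reference, the numbers in the error message above can be worked through with a small Python sketch. This is only an illustration: `parse_oom` is a hypothetical helper written for this post, not part of PyTorch or lightweight_gan, and treating "reserved minus allocated" as the most that better cache reuse could recover is a simplifying assumption.

```python
import re

# Error text copied from the traceback above.
msg = (
    "CUDA out of memory. Tried to allocate 768.00 MiB "
    "(GPU 0; 3.82 GiB total capacity; 2.48 GiB already allocated; "
    "98.81 MiB free; 2.61 GiB reserved in total by PyTorch)"
)

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def to_mib(value, unit):
    """Convert a (number, unit) pair from the message to MiB."""
    return float(value) * UNIT_TO_MIB[unit]


def parse_oom(message):
    """Hypothetical helper: extract the sizes (in MiB) from an OOM message."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    return {
        name: to_mib(*re.search(pattern, message).groups())
        for name, pattern in patterns.items()
    }


stats = parse_oom(msg)

# Memory PyTorch holds in its cache but is not using for live tensors.
# Assumption: this is the most that reducing fragmentation could recover.
cached_unused = stats["reserved"] - stats["allocated"]

# How far short the request is even if free and cached memory were both usable.
shortfall = stats["requested"] - stats["free"] - cached_unused

print(f"requested:     {stats['requested']:.2f} MiB")
print(f"free:          {stats['free']:.2f} MiB")
print(f"cached unused: {cached_unused:.2f} MiB")
print(f"shortfall:     {shortfall:.2f} MiB")
```

Under that assumption, reserved memory (2.61 GiB) is only slightly above allocated memory (2.48 GiB), so the "reserved >> allocated" condition in the hint is not met. That would be consistent with max_split_size_mb not helping here: the single 768 MiB request is larger than what is left on the card.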

Are there any suggestions for generating interpolations without hitting this error? Any other feedback would also be very helpful.