timothybrooks / instruct-pix2pix

CUDA OOM when training on 4× 24 GB GPUs with batch size 1 #84

Closed styfeng closed 1 year ago

styfeng commented 1 year ago

See title. This seems a bit unreasonable to me. I wonder if it's an issue with the script, because I doubt 96 GB of total VRAM would be insufficient to train with a batch size of 1 (images are 256x256)... if anybody has gotten the training working, let me know!
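For anyone debugging the same thing, here's a minimal probe to check how close each card actually gets to its 24 GB limit; this is a generic PyTorch sketch, not part of the instruct-pix2pix scripts:

```python
import torch

# Hypothetical debugging probe (not from this repo's training code):
# run after a forward/backward pass to see the peak allocation per GPU.
for i in range(torch.cuda.device_count()):
    peak_gb = torch.cuda.max_memory_allocated(i) / 1024**3
    print(f"cuda:{i} peak allocated: {peak_gb:.2f} GB")
```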

styfeng commented 1 year ago

Update: I just tried an A40 (48 GB of VRAM) and it trains fine with batch sizes up to 64. The model by default seems to consume around 26 GB of VRAM or more (I believe), so distributing across 24 GB GPUs doesn't work: under data-parallel training, each GPU still holds a full replica of the model, so adding more 24 GB cards doesn't lower the per-GPU footprint.
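A rough back-of-the-envelope for why that is (a sketch assuming plain fp32 Adam under DDP; not measured from this repo):

```python
import torch

def ddp_param_memory_gb(model: torch.nn.Module) -> float:
    """Rough per-rank memory for weights + grads + Adam state under DDP.

    fp32 weights (4 B) + fp32 grads (4 B) + two Adam moment buffers (8 B)
    = 16 bytes per trainable parameter, replicated on EVERY rank.
    Activations, EMA copies, and frozen encoders come on top of this.
    """
    n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return n_params * 16 / 1024**3
```

For a Stable-Diffusion-scale model (on the order of 1B trainable parameters) that's already ~15 GB before any activations, which seems roughly consistent with the ~26 GB observed above.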