Xirider / finetune-gpt2xl

Guide: Finetune GPT2-XL (1.5 Billion Parameters) and finetune GPT-NEO (2.7 B) on a single GPU with Huggingface Transformers using DeepSpeed

Out of memory with RTX3090 #19

Open PyxAI opened 2 years ago

PyxAI commented 2 years ago

Hi, I'm trying to train gpt2-xl but keep getting OOM, even when I set batch size to 1, gradient_accumulation to 8/16/512, contiguous_gradients to false, and allgather_bucket_size / reduce_bucket_size to 2e2. I can see in nvidia-smi that I'm only reaching about half the memory capacity, around 12 GB.

My system, as stated: 3090 with 24 GB memory, 80 GB RAM, 5600X CPU (if that matters), running WSL2 on Windows 10. Thanks.
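For reference, the knobs mentioned above live in the DeepSpeed config file passed to the training script. Below is a minimal sketch of such a config, assuming the repo's ZeRO stage 2 + CPU offload setup; the exact values here are illustrative (mirroring the settings the comment tried), not the repo's defaults:

```json
{
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 8,
  "fp16": { "enabled": true },
  "zero_optimization": {
    "stage": 2,
    "cpu_offload": true,
    "contiguous_gradients": false,
    "allgather_bucket_size": 2e8,
    "reduce_bucket_size": 2e8
  }
}
```

Smaller bucket sizes trade throughput for lower peak GPU memory during gradient reduction, and `cpu_offload` moves optimizer state into host RAM, which is why host-side memory limits (such as those imposed by WSL2) can still trigger OOM even when the GPU itself has headroom.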

PyxAI commented 2 years ago

So working with WSL is just a no-go. I installed dual-boot Ubuntu and now the problem has disappeared.
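One possible culprit (an assumption, not confirmed in this thread): WSL2 caps the RAM visible to the Linux VM at a fraction of host memory by default, which matters here because ZeRO's CPU offload pins large buffers in host RAM. The cap can be raised in `%UserProfile%\.wslconfig` on the Windows side, for example:

```ini
[wsl2]
# Raise the RAM ceiling for the WSL2 VM (the default is a fraction of host RAM).
memory=64GB
# Extra swap as a safety margin for offloaded optimizer state.
swap=32GB
```

After editing, restart the VM with `wsl --shutdown` for the new limits to take effect.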

BrandonKoerner commented 2 years ago

Dual boot only, huh... that sucks. I was really hoping I could use this on Win10 or WSL(2).

PyxAI commented 2 years ago

I was, however, able to run the model under WSL2 on Windows 11. I didn't check training, but it's worth a shot. @ReewassSquared
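For anyone who wants to reproduce the "run the model" part, here is a minimal inference sketch with Transformers; the checkpoint directory `finetuned` is a placeholder for wherever training wrote its output, and `.half()` is an assumption to make GPT2-XL fit comfortably in 24 GB:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Tokenizer comes from the base model; weights come from the finetuned checkpoint.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2-xl")
model = GPT2LMHeadModel.from_pretrained("finetuned").half().cuda()

# Encode a prompt, move it to the GPU, and sample a continuation.
inputs = tokenizer("The meaning of life is", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_length=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```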

uahmad235 commented 2 years ago

Hi @PyxAI. Which Ubuntu version did you run this code on?