CStanKonrad / long_llama

LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
Apache License 2.0
1.45k stars 85 forks

How much VRAM is needed to finetune the 3B model? Is 12GB enough? #15

Open universewill opened 1 year ago

universewill commented 1 year ago

How much VRAM is needed to finetune the 3B model? Is 12GB enough?

CStanKonrad commented 1 year ago

Unfortunately, 12GB is not enough to finetune the 3B model in the standard way (tuning all parameters). That is because, beyond the model weights themselves, memory is needed for the optimizer states and the gradients. This Hugging Face blog post briefly describes how much each of those parts contributes to VRAM usage. For our model, we used a single A100 80GB GPU, and usage metrics showed that over 70GB of GPU memory was allocated.
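The per-parameter breakdown above can be turned into a rough lower bound. A minimal sketch, assuming the commonly cited figures for mixed-precision Adam training (fp16 weights and gradients, fp32 master weights and Adam moment estimates, roughly 16 bytes per parameter); the function name and constants are illustrative, not from the LongLLaMA codebase, and activation memory is excluded:

```python
# Rough lower-bound VRAM estimate for full finetuning with Adam
# in mixed precision. Illustrative per-parameter costs (assumption):
#   fp16 weights:          2 bytes
#   fp32 master weights:   4 bytes
#   fp16 gradients:        2 bytes
#   fp32 Adam states m, v: 8 bytes
#   total:                16 bytes per parameter
# Activations come on top and grow with batch size and sequence length.

def estimate_vram_gb(num_params: float, bytes_per_param: int = 16) -> float:
    """Return an activation-free VRAM estimate in GiB."""
    return num_params * bytes_per_param / 1024**3

if __name__ == "__main__":
    params_3b = 3e9  # 3B parameters
    print(f"~{estimate_vram_gb(params_3b):.0f} GiB before activations")
```

Even this optimistic estimate lands around 45 GiB for a 3B model, which is why 12GB cannot hold full-parameter finetuning and why the single A100 80GB run reported above 70GB allocated once activations are included.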