Oryx-mllm / Oryx

MLLM for On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
https://oryx-mllm.github.io

How much GPU memory is needed to fine-tune the 7B model using LoRA? #3

Closed Gary-code closed 2 months ago

Gary-code commented 2 months ago

How much GPU memory is needed to fine-tune the 7B model using LoRA?

liuzuyan commented 2 months ago

Hi, thanks for your interest in our work! We have not tried LoRA fine-tuning. For reference, we run full fine-tuning of Oryx-7B on NVIDIA A100 40GB GPUs with DeepSpeed ZeRO-3, a 16k context length, and a per-device batch size of 2. Therefore, 40GB is more than sufficient for LoRA fine-tuning, and you can also try 24GB GPUs with a smaller batch size.
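To see why LoRA needs far less memory than full fine-tuning, a rough back-of-envelope estimate helps (this is a sketch with assumed numbers, not a measurement from this repo: the trainable fraction and optimizer bytes below are typical LoRA/Adam ballpark figures, and activation memory is ignored):

```python
def lora_memory_estimate_gib(n_params: float,
                             bytes_per_param: int = 2,        # bf16 weights
                             trainable_fraction: float = 0.01, # assumed LoRA adapter share
                             optimizer_bytes_per_trainable: int = 12) -> float:
    """Rough GPU-memory estimate for LoRA fine-tuning, ignoring activations.

    Frozen base weights need no optimizer state; only the small LoRA
    adapters carry Adam moments (assumed ~12 bytes/param in fp32).
    """
    frozen = n_params * bytes_per_param                 # frozen base model
    trainable = n_params * trainable_fraction           # LoRA adapter params
    adapters = trainable * bytes_per_param              # adapter weights
    optim = trainable * optimizer_bytes_per_trainable   # optimizer states
    return (frozen + adapters + optim) / 2**30

print(f"{lora_memory_estimate_gib(7e9):.1f} GiB")  # ~14 GiB for a 7B model
```

Under these assumptions the static footprint is roughly 14 GiB, so the remaining headroom on a 40GB card (or even a 24GB card with a small batch size) goes to activations, which grow with context length and batch size.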