Alpha-VLLM / LLaMA2-Accessory

An Open-source Toolkit for LLM Development
https://llama2-accessory.readthedocs.io/

full parameter finetuning on A100 40G #91

Closed 10 months ago by ZhenYangIACAS

ZhenYangIACAS commented 10 months ago

Hi, does the current version of the code support full-parameter finetuning on A100 40G? I have tried, but found that the GPU memory only allows finetuning 10 layers. If it is possible, could you show me the right settings?

ChrisLiu6 commented 10 months ago

How many GPUs do you have? More GPUs reduce the memory burden on each individual one. Furthermore, the model size and the sequence length also influence whether it fits.
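As a rough back-of-the-envelope (assuming standard mixed-precision Adam accounting, not something specific to this repo): each parameter costs about 2 bytes (fp16 weight) + 2 (fp16 gradient) + 4 (fp32 master weight) + 4 + 4 (Adam moments) = 16 bytes, so full-parameter finetuning of a 7B model needs roughly 7e9 × 16 B ≈ 112 GB of state before activations. That is why this state has to be sharded across GPUs to fit on 40 GB cards.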

ZhenYangIACAS commented 10 months ago

8 GPUs, max_words = 512, 7B

ChrisLiu6 commented 10 months ago

That should be enough. You may try the following if you run into OOM problems (a launch sketch follows the list):

  1. reduce the batch size
  2. turn on gradient checkpointing (add --checkpointing to the main_finetune.py starting command)
  3. change the data parallel strategy from sdp to fsdp
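For example, a minimal sketch with all three applied. Only --checkpointing is named above; --nproc_per_node is standard torchrun, while --data_parallel, --batch_size, --accum_iter, the config flags, and all paths are assumptions modeled on typical torchrun finetuning scripts, so check them against the repo's example scripts:

```bash
# Sketch only: flags other than --checkpointing are assumptions based on
# common torchrun finetuning setups; adapt names and paths to the repo's
# actual example scripts.
torchrun --nproc_per_node=8 main_finetune.py \
    --llama_config /path/to/7B/params.json \
    --tokenizer_path /path/to/tokenizer.model \
    --batch_size 1 \
    --accum_iter 4 \
    --data_parallel fsdp \
    --checkpointing
```

Lowering --batch_size while raising --accum_iter keeps the effective batch size constant, and fsdp shards optimizer state, gradients, and parameters across all 8 GPUs instead of only optimizer state as sdp does.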