Closed Xiaohui9607 closed 5 months ago
Hi, I am fine-tuning llama3-8b on the dataset you provide, with 4x8 A100s. How long does it take on your side to finish the experiment? Fine-tuning vicuna-7b took me only 12 hours, but llama3-8b took 42 hours. Is this the same on your end? Thanks!
For LLaMA3, I set the max_length to 6144. You can set this arg back to 4096 to reduce the training time.
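For reference, a minimal sketch of how the override might look on the command line. The entrypoint name (`train.py`), launcher flags, and model path here are assumptions for illustration; only the `max_length` values (6144 vs. 4096) come from this thread, so adapt the rest to the repo's actual training script.

```shell
# Hypothetical launch command -- script name and flag spellings are
# assumptions; check the repo's training entrypoint for the real ones.
torchrun --nproc_per_node=8 train.py \
    --model_name_or_path meta-llama/Meta-Llama-3-8B \
    --max_length 4096   # 6144 was used for the reported LLaMA3 run;
                        # shorter sequences roughly shrink per-step cost
```

Attention cost grows superlinearly with sequence length, so dropping max_length from 6144 to 4096 can noticeably cut wall-clock time, at the cost of truncating longer training samples.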