Closed: xmu-xiaoma666 closed this issue 6 months ago
Hi @xmu-xiaoma666,
Thank you for your interest in our work. We use a learning rate of 2e-5
during full fine-tuning of both the LLaMA-3 and Phi-3 based models. I hope this helps. Good luck, and let me know if you have any questions.
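For reference, a minimal sketch of what that setting typically amounts to in practice: a peak learning rate of 2e-5 with linear warmup followed by linear decay, a common default for full fine-tuning. The warmup and total step counts below are illustrative assumptions, not values from this thread.

```python
# Illustrative SFT learning-rate schedule: linear warmup to the peak
# lr of 2e-5 mentioned above, then linear decay to zero.
# WARMUP_STEPS and TOTAL_STEPS are hypothetical placeholders.

PEAK_LR = 2e-5
WARMUP_STEPS = 100
TOTAL_STEPS = 1000

def lr_at(step: int) -> float:
    """Return the learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Ramp linearly from 0 up to PEAK_LR during warmup.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay linearly from PEAK_LR back to 0 over the remaining steps.
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

print(lr_at(WARMUP_STEPS))  # peak value: 2e-05
```

The exact schedule (cosine vs. linear, warmup ratio) will depend on the training script's arguments; only the peak value of 2e-5 comes from the answer above.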
Hi @xmu-xiaoma666,
We just added the full fine-tuning script that will reproduce our reported results. Good luck, and let us know if you have any questions.
What is the learning rate when fine-tuning all LLM parameters during the SFT stage?