Closed: xmu-xiaoma666 closed this issue 6 months ago
Hi @xmu-xiaoma666,
Thank you for your interest in our work. We use a learning rate of 2e-5
during full fine-tuning of both the LLaMA-3 and Phi-3 based models. I hope this helps. Good luck, and let me know if you have any questions.
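For reference, a minimal sketch of what that setting typically amounts to in practice: a peak learning rate of 2e-5 with linear warmup followed by linear decay, a common default for full fine-tuning. The warmup and total step counts below are illustrative assumptions, not values from this thread.

```python
# Illustrative SFT learning-rate schedule: linear warmup to the peak
# lr of 2e-5 mentioned above, then linear decay to zero.
# WARMUP_STEPS and TOTAL_STEPS are hypothetical placeholders.

PEAK_LR = 2e-5
WARMUP_STEPS = 100
TOTAL_STEPS = 1000

def lr_at(step: int) -> float:
    """Return the learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Ramp linearly from 0 up to PEAK_LR during warmup.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay linearly from PEAK_LR back to 0 over the remaining steps.
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

print(lr_at(WARMUP_STEPS))  # peak value: 2e-05
```

The exact schedule (cosine vs. linear, warmup ratio) will depend on the training script's arguments; only the peak value of 2e-5 comes from the answer above.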
Hi @xmu-xiaoma666,
We just added the full fine-tuning script that will reproduce our reported results. Good luck, and let us know if you have any questions.
What is the learning rate when fine-tuning all LLM parameters during the SFT stage?