YanSte / NLP-LLM-Fine-tuning-Llame-2-QLoRA-2024

Natural Language Processing (NLP) and Large Language Models (LLM) with Fine-Tuning LLM QLoRA and Llama 2 in 2024
https://www.kaggle.com/yannicksteph/nlp-llm-fine-tuning-2024-llama-2-qlora/
6 stars, 2 forks

More RL fine-tuning examples for LLMs (PPO & DPO) #1

Open: JhonDan1999 opened 4 months ago

JhonDan1999 commented 4 months ago

Great repo! Can you add example notebooks on using PPO and DPO for RL fine-tuning of LLMs with SFTs?

Thanks

YanSte commented 4 months ago

Thanks! Yes, I will work on it soon. I'll ping you when it's done.