Open Abe13 opened 7 months ago
Hello @Abe13, thanks for raising this issue! Yes, there seems to be a discrepancy / regression that occurred during the porting of our internal codebase, and we're currently working on tracking it down. See this issue for related discussion: https://github.com/huggingface/alignment-handbook/issues/45
You claim that "In practice, we find comparable performance for both full and LoRA fine-tuning, with the latter having the advantage of producing small adapter weights that are fast to upload and download from the Hugging Face Hub."
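For scale, the "small adapter weights" part of that claim is easy to sanity-check with a back-of-envelope count. The sketch below assumes Mistral-7B-v0.1 shapes (Zephyr's base model) and a hypothetical rank-16 LoRA on the attention projections; the handbook's actual LoRA config (rank, target modules) may differ:

```python
# LoRA adds r * (d_in + d_out) parameters per adapted linear layer
# (A: d_in x r, B: r x d_out). Shapes below are Mistral-7B-v0.1 defaults;
# the rank and target modules are assumptions, not the handbook's config.

def lora_params(d_in, d_out, r):
    """Parameters added by a rank-r LoRA pair on a d_in x d_out linear."""
    return r * (d_in + d_out)

hidden = 4096      # Mistral-7B hidden size
n_layers = 32      # number of decoder layers
kv_dim = 1024      # 8 KV heads x 128 head dim (grouped-query attention)
r = 16             # hypothetical LoRA rank

per_layer = (
    lora_params(hidden, hidden, r)    # q_proj
    + lora_params(hidden, kv_dim, r)  # k_proj
    + lora_params(hidden, kv_dim, r)  # v_proj
    + lora_params(hidden, hidden, r)  # o_proj
)
total = per_layer * n_layers
print(f"adapter params: {total:,}")               # 13,631,488
print(f"fp16 size: {total * 2 / 2**20:.0f} MiB")  # ~26 MiB, vs ~14 GiB for full fp16 weights
```

So an adapter on this order is indeed a few tens of MiB, which is why it is fast to push and pull from the Hub; the open question here is whether the quality claim also holds.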
However, when I try the LoRA DPO-aligned model you have trained, alignment-handbook/zephyr-7b-dpo-lora, I see a severe performance degradation. Here is an example of output where the model seems confused: ![image](https://github.com/huggingface/alignment-handbook/assets/3280518/1c5eae99-9641-469a-bb73-b66a26a594d4)
Even the training loss indicates that the model has not learned much. For comparison, here is the training loss for the full-model DPO alignment: ![image](https://github.com/huggingface/alignment-handbook/assets/3280518/902aaf32-0446-4ab1-8e38-28afcd456fed)
Could you please clarify? Is my observation different from what you have experienced?
Thanks