sail-sg / sdft

[ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".
https://arxiv.org/abs/2402.13669

question about training paradigm #5

Closed zhuang-li closed 2 days ago

zhuang-li commented 2 days ago

Hi, this is very interesting work! One thing I don't understand: does the self-distillation rewrite the responses with Llama2-chat and then further fine-tune Llama2-chat as well, or does it just fine-tune the Llama2 base model?

rickyang1114 commented 2 days ago

Hello, thanks for your interest! 

We actually utilize Llama-2-chat throughout the whole process. 
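For concreteness, here is a minimal sketch of that loop. This is my own illustration, not the code in this repo: the chat model first rewrites each demonstration response in the task dataset, and the same chat checkpoint is then fine-tuned on the rewritten responses. The model name, prompt wording, and toy dataset below are placeholders.

```python
# Sketch of the self-distillation paradigm: the same chat model rewrites
# the task responses and is then fine-tuned on its own rewrites.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-2-7b-chat-hf"  # chat model used throughout

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

def rewrite_response(instruction, reference_response):
    # The chat model restates the reference answer in its own words
    # (the self-distillation rewrite step); this prompt wording is
    # illustrative, not the exact template from the paper.
    prompt = (
        "[INST] Below is an instruction and a reference answer.\n"
        f"Instruction: {instruction}\n"
        f"Reference answer: {reference_response}\n"
        "Rewrite the reference answer in your own words. [/INST]"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

# Toy stand-in; in practice this is the downstream fine-tuning dataset.
task_dataset = [
    {"instruction": "Explain what overfitting is.",
     "response": "Overfitting is when a model memorizes training data ..."},
]

# Step 1: rewrite every demonstration with the chat model itself.
distilled_dataset = [
    {"instruction": ex["instruction"],
     "response": rewrite_response(ex["instruction"], ex["response"])}
    for ex in task_dataset
]

# Step 2: fine-tune the *same* Llama-2-chat checkpoint on `distilled_dataset`
# with a standard SFT trainer; no separate Llama-2 base model is involved.
```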


zhuang-li commented 2 days ago

Thank you for your prompt response! Interestingly, we have a concurrent project that closely aligns with the rewriting technique you describe. Rewriting isn't the primary focus of our work; we applied the method to train our base model for an empirical study, but we found results similar to yours. We only recently became aware of your excellent prior work in this area! Our research explores the technique in a different context and offers additional analysis of its effectiveness. We hope it can help with your research as well:

https://arxiv.org/pdf/2406.10882

rickyang1114 commented 2 days ago

Thanks for bringing your work to our attention. It is solid and insightful from the perspective of example selection, and I believe it will be helpful.