
Script for Full Fine-Tuning of Mixtral #68

Open alpayariyak opened 11 months ago

alpayariyak commented 11 months ago

Hi, I see that there is a script for training Mixtral, but not one for fine-tuning it. Could you please provide one? The community is having a lot of trouble getting correct full fine-tuning to work, including our team at OpenChat as well as the teams at Nous Research, Axolotl, and others. This would be incredibly helpful.
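For context, a minimal sketch of what "full fine-tuning" here would look like outside of MegaBlocks, using the Hugging Face Transformers `Trainer`. This is not from the thread; the checkpoint name, dataset, and hyperparameters are placeholders, and a real run would need a distributed setup (FSDP or DeepSpeed, configured separately) to fit the full model in memory:

```python
# Hypothetical sketch: full (all-parameter) fine-tuning of Mixtral with
# Hugging Face Transformers. Checkpoint, dataset, and hyperparameters are
# placeholders, not a recommendation from the MegaBlocks maintainers.
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "mistralai/Mixtral-8x7B-v0.1"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Mixtral has no pad token by default

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
)

# Placeholder dataset; replace with your own instruction-tuning data.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="mixtral-full-ft",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=1e-5,
    num_train_epochs=1,
    bf16=True,
    logging_steps=10,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    # mlm=False gives standard causal-LM labels (inputs shifted by one).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

trainer.train()
```

The reported problems are presumably about setups along these lines; the point of the request is an equivalent, known-correct recipe on top of MegaBlocks itself.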

tgale96 commented 10 months ago

Hi! Sorry for the delay!

We discussed this a bit in https://github.com/stanford-futuredata/megablocks/issues/59 as well. Unfortunately, we do not have a script for fine-tuning Mixtral. Could you elaborate on the issues you're seeing when trying to fine-tune with other setups?