tjtanaa closed this issue 7 months ago
Me too. I plan to train Mixtral Eagle on Chinese SFT data and test the speedup it gives on Chinese tasks.
We will upload the training code for Mixtral in a few days. Please stay tuned. Thanks for your interest.
The current weights were trained using Mixtral_8x7B.json. We have not optimized for MoE yet, so the same training code (main.py) was used. We are planning to optimize for MoE.
Hi, I am interested in training Mixtral Eagle. Are there plans to release the training code anytime soon?