jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Apache License 2.0

TinyMix (Not an Issue) #122

Closed by nivibilla 9 months ago

nivibilla commented 9 months ago

Thanks for this amazing project. Using the experimental Mixtral branch of mergekit, I was able to MoE-ify the chat version of TinyLlama to make TinyMix-8x1b-chat.
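If anyone wants to poke at the merged checkpoint, here is a minimal sketch of loading it with Hugging Face transformers. The repo id `nivibilla/tinymix-8x1b-chat` and the chat-template call are assumptions based on the model name above (the merge produces a Mixtral-style model, so it needs a transformers version with Mixtral support), not a verified recipe.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed HF repo id for the merge described above.
model_id = "nivibilla/tinymix-8x1b-chat"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # needs the `accelerate` package; drop it to load on CPU
)

# TinyLlama-chat style prompting via the tokenizer's chat template.
messages = [{"role": "user", "content": "Explain mixture-of-experts in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=64, do_sample=False)
# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```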

Since it's rumoured that Mixtral was trained by upscaling Mistral, TinyMix is my attempt at a starting point for a scaled-down version of Mixtral.

Once again, appreciate the work on this project and looking forward to what's next!

ChaosCodes commented 9 months ago

Thank you for your interest. Since Mixtral has been released, we are also interested in building a TinyLlama MoE, which seems like an awesome piece of work. Good luck to you!