SmerkyG / gptcore

Fast modular code to create and train cutting edge LLMs
Apache License 2.0

Hello, can you add the deepseek-moe component? #3

Status: Open · opened by win10ogod 7 months ago

win10ogod commented 7 months ago

Hello, can you add the deepseek-moe component (https://arxiv.org/abs/2401.06066)? I'd like to train a Mixture-of-Experts RWKV as well as a Mixture-of-Experts gptalpha.
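For anyone reading along, the core idea in the linked paper is to combine a few always-active "shared" experts with many small routed experts selected per token by a top-k gate. Below is a minimal, self-contained PyTorch sketch of that structure, not gptcore's actual component; all class and parameter names here (`DeepSeekMoELayer`, `n_shared`, `n_routed`, `top_k`) are illustrative assumptions, and it runs every expert densely for clarity rather than using the sparse dispatch a real implementation would need.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeepSeekMoELayer(nn.Module):
    """Illustrative DeepSeek-MoE-style layer (not from gptcore):
    a few always-on shared experts plus many small routed experts
    chosen per token by a top-k softmax gate."""

    def __init__(self, d_model=64, d_ff=128, n_shared=1, n_routed=8, top_k=2):
        super().__init__()

        def make_expert():
            # Small FFN expert; fine-grained experts use a reduced d_ff
            return nn.Sequential(
                nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
            )

        self.shared = nn.ModuleList(make_expert() for _ in range(n_shared))
        self.routed = nn.ModuleList(make_expert() for _ in range(n_routed))
        self.gate = nn.Linear(d_model, n_routed, bias=False)
        self.top_k = top_k

    def forward(self, x):
        # x: (batch, seq, d_model)
        # Shared experts process every token unconditionally
        out = sum(expert(x) for expert in self.shared)

        # Gate scores over routed experts, keep top-k per token
        scores = F.softmax(self.gate(x), dim=-1)      # (B, T, n_routed)
        topv, topi = scores.topk(self.top_k, dim=-1)  # (B, T, k)

        # Dense reference routing: evaluate each expert on all tokens,
        # then mask in only the tokens that selected it (inefficient,
        # but easy to verify against a sparse implementation)
        for slot in range(self.top_k):
            idx = topi[..., slot]                 # (B, T) chosen expert ids
            w = topv[..., slot].unsqueeze(-1)     # (B, T, 1) gate weights
            for e_id, expert in enumerate(self.routed):
                mask = (idx == e_id).unsqueeze(-1)
                out = out + torch.where(mask, w * expert(x), torch.zeros_like(x))
        return out
```

As a quick sanity check, `DeepSeekMoELayer()(torch.randn(2, 5, 64))` returns a tensor of shape `(2, 5, 64)`, so the layer is shape-preserving and could slot in where a dense FFN block sits.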

SmerkyG commented 7 months ago

I'm currently experimenting with MoE variations, and will add support once I'm further along on that.