kyegomez / MoE-Mamba

Implementation of MoE-Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in PyTorch and Zeta
https://discord.gg/GYbXvDGevY
MIT License

Is class SwitchMixtureOfExperts unused in the main model? #6

Open · lunaaa95 opened 1 month ago

lunaaa95 commented 1 month ago

I noticed that there are two versions of the MoE class in this repo. One is in model.py, named SwitchMoE, and is used in MambaMoE. The other is in block.py, named SwitchMixtureOfExperts, and is not used anywhere in MambaMoE. What is the purpose of keeping both, and what is the difference between them?
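For context, both classes appear to implement a Switch Transformer style (top-1 routed) mixture-of-experts feed-forward layer. Below is a minimal sketch of what such a layer typically looks like; the class name, arguments, and internals are hypothetical and may not match either SwitchMoE or SwitchMixtureOfExperts exactly:

```python
import torch
import torch.nn as nn


class SwitchStyleMoE(nn.Module):
    """Top-1 (Switch-style) mixture-of-experts feed-forward layer.

    Illustrative sketch only -- names and signatures are assumptions,
    not the repo's actual API.
    """

    def __init__(self, dim: int, hidden_dim: int, num_experts: int = 8):
        super().__init__()
        self.num_experts = num_experts
        # Router produces one logit per expert for every token.
        self.router = nn.Linear(dim, num_experts)
        # Each expert is an independent feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(dim, hidden_dim),
                nn.GELU(),
                nn.Linear(hidden_dim, dim),
            )
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        batch, seq_len, dim = x.shape
        tokens = x.reshape(-1, dim)                   # (tokens, dim)

        logits = self.router(tokens)                  # (tokens, num_experts)
        probs = logits.softmax(dim=-1)
        gate, expert_idx = probs.max(dim=-1)          # top-1 routing

        out = torch.zeros_like(tokens)
        for e in range(self.num_experts):
            mask = expert_idx == e
            if mask.any():
                # Route the selected tokens through expert e and scale
                # the output by the gate probability.
                out[mask] = gate[mask].unsqueeze(-1) * self.experts[e](tokens[mask])

        return out.reshape(batch, seq_len, dim)


if __name__ == "__main__":
    layer = SwitchStyleMoE(dim=64, hidden_dim=256, num_experts=4)
    y = layer(torch.randn(2, 16, 64))
    print(y.shape)  # torch.Size([2, 16, 64])
```

If both classes follow this pattern, is SwitchMixtureOfExperts just an older or alternative implementation that was superseded by SwitchMoE?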


github-actions[bot] commented 1 month ago

Hello there, thank you for opening an issue! 🙏🏻 The team has been notified and will get back to you as soon as possible.