kyegomez / MoE-Mamba

Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Zeta

Is Class SwitchMixtureOfExperts unused in main model? #6

Closed. lunaaa95 closed this issue 1 month ago

lunaaa95 commented 4 months ago

I noticed that there are two versions of the MoE class in the repo. One is in model.py, named SwitchMoE, and is used in MambaMoE. The other is in block.py, named SwitchMixtureOfExperts, and is not used in the MambaMoE model. What is the purpose of that, and what is the difference between the two?
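
For context, both classes presumably implement a Switch-Transformer-style MoE feedforward with top-1 token routing. Below is a minimal sketch of that general pattern in PyTorch; the class name `SwitchMoESketch`, the dimensions, and the routing details are illustrative assumptions, not the repo's exact API.

```python
# Minimal sketch of a Switch-style MoE layer with top-1 routing.
# Illustrative only; not the exact SwitchMoE / SwitchMixtureOfExperts code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwitchMoESketch(nn.Module):
    def __init__(self, dim: int, hidden_dim: int, num_experts: int):
        super().__init__()
        # Router produces one logit per expert for every token.
        self.gate = nn.Linear(dim, num_experts)
        # Each expert is a small feedforward block.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden_dim), nn.GELU(), nn.Linear(hidden_dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        logits = self.gate(x)                      # (batch, seq_len, num_experts)
        probs = F.softmax(logits, dim=-1)
        top_prob, top_idx = probs.max(dim=-1)      # top-1 expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i                    # tokens routed to expert i
            if mask.any():
                # Scale each expert output by its routing probability.
                out[mask] = top_prob[mask].unsqueeze(-1) * expert(x[mask])
        return out


if __name__ == "__main__":
    layer = SwitchMoESketch(dim=512, hidden_dim=2048, num_experts=8)
    y = layer(torch.randn(2, 16, 512))
    print(y.shape)  # torch.Size([2, 16, 512])
```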

github-actions[bot] commented 4 months ago

Hello there, thank you for opening an issue! 🙏🏻 The team has been notified and will get back to you ASAP.

github-actions[bot] commented 2 months ago

Stale issue message