james-oldfield / muMoE

[NeurIPS'24] Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization
http://james-oldfield.github.io/muMoE