issues
james-oldfield/muMoE
[NeurIPS'24] Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization
http://james-oldfield.github.io/muMoE
21 stars
1 fork
Is it suitable for autoregressive model?
#1 · jiangsongtao · closed · 5 months ago · 4