timlee0212 SiDA-MoE issues - Githubissues

timlee0212 / SiDA-MoE

Code for MLSys 2024 Paper "SiDA-MoE: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models"

MIT License

8 stars 4 forks

issues

In the NLG context, where a decoding phase is necessary, do you predict and load the predicted experts in every iteration? I am curious about this because you only mention classification tasks.

#2 victayria77 opened 3 weeks ago, 1 comment
Failed to load dataset for finetune.py

#1 Tonanguyxiro closed 2 months ago, 2 comments