pentium3 / sys_reading

system paper reading notes

SiDA: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models #360

Open. pentium3 opened this issue 6 months ago.