pentium3 / sys_reading

system paper reading notes

SiDA: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models #360

Open: pentium3 opened this issue 8 months ago