pentium3 / sys_reading

system paper reading notes
DeepSpeed-Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale #349

Open · pentium3 opened 8 months ago

pentium3 commented 8 months ago

https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10046087