microsoft / DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Apache License 2.0

How is prompt segmentation implemented for Dynamic SplitFuse? Is there an implementation or a code snippet? #462

Open wenyangchou opened 7 months ago