microsoft / DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Apache License 2.0

Question About Offloading and Recomputation #523

Open lxnlxnlxnlxnlxn opened 2 months ago

lxnlxnlxnlxnlxn commented 2 months ago

I am working on KV cache offloading and recomputation. I am wondering whether DeepSpeed-MII has implemented offloading and recomputation techniques. If so, where is the API?

DeepSpeed-MII does not support offloading.