Open lxnlxnlxnlxnlxn opened 2 months ago
I am working on KV cache offloading and recomputation. I am wondering whether DeepSpeed-MII has implemented offloading and recomputation techniques. If so, where is the API?