Zefan-Cai / PyramidKV

The Official Implementation of PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
https://arxiv.org/pdf/2406.02069
MIT License
486 stars, 44 forks

Mistral 7B full kv cache out of memory #11

Closed. monster119120 closed this issue 3 months ago

monster119120 commented 3 months ago

How did you evaluate the full KV cache version of Mistral 7B on LongBench in the paper? I get an out-of-memory (OOM) error as soon as I run it...

Zefan-Cai commented 3 months ago

We used 80 GB GPUs; it does indeed OOM on 40 GB GPUs. How much memory does your GPU have?
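For a rough sense of the memory involved, the sketch below estimates the size of an uncompressed (full) KV cache and, separately, the attention score matrix that a non-fused attention implementation would materialize for one layer. The default values (32 layers, 32 attention heads, 8 KV heads, head dimension 128, 2 bytes per element for fp16/bf16) are assumptions taken from the published Mistral-7B configuration, and the function names are made up for illustration. The thread does not say which attention implementation was used, so this is only an estimate of possible contributors, not a diagnosis of the OOM.

```python
GIB = 1024 ** 3

def kv_cache_bytes(seq_len, num_layers=32, num_kv_heads=8, head_dim=128,
                   bytes_per_elem=2, batch_size=1):
    """Bytes needed to keep keys and values for every layer (full KV cache)."""
    # Factor 2 accounts for storing both K and V.
    return (2 * batch_size * num_layers * num_kv_heads
            * head_dim * seq_len * bytes_per_elem)

def eager_attention_scores_bytes(seq_len, num_heads=32, bytes_per_elem=2,
                                 batch_size=1):
    """Bytes for one layer's seq_len x seq_len score matrix (non-fused attention)."""
    return batch_size * num_heads * seq_len * seq_len * bytes_per_elem

if __name__ == "__main__":
    for n in (8_192, 16_384, 32_768):
        kv = kv_cache_bytes(n) / GIB
        scores = eager_attention_scores_bytes(n) / GIB
        print(f"{n:>6} tokens: KV cache {kv:.2f} GiB, "
              f"score matrix per layer {scores:.2f} GiB")
```

Under these assumptions the cache grows linearly with sequence length while the score matrix grows quadratically, which is why long-context prefill can exhaust memory even when the cache itself is comparatively small.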

monster119120 commented 3 months ago

> We used 80 GB GPUs; it does indeed OOM on 40 GB GPUs. How much memory does your GPU have?

I'm also on an 80 GB A100...

Zefan-Cai commented 3 months ago

The multi-GPU code is now on GitHub; try running FullKV on two A100s. That said, at the time I was able to run Mistral FullKV on a single A100.

monster119120 commented 3 months ago

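> The multi-GPU code is now on GitHub; try running FullKV on two A100s. That said, at the time I was able to run Mistral FullKV on a single A100.

Thanks for updating the code! FullKV on a single 80 GB A100 now works without any problems!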

Zefan-Cai commented 3 months ago

Great. May I ask how you solved it?

monster119120 commented 3 months ago

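> Great. May I ask how you solved it?

I wrote a new algorithm of my own, and then replaced Mistral's prepare_input with the official transformers one. That was all it took.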
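The general pattern described here, reverting a monkey-patched method to the library's original, can be sketched as follows. Everything in this block is a stand-in: `ToyModel`, `prepare_inputs`, and `custom_prepare_inputs` are hypothetical names, and the behavior of both functions is invented purely to make the swap observable. It does not reproduce the repository's patch or the transformers API.

```python
class ToyModel:
    """Hypothetical stand-in for a library model class."""

    def prepare_inputs(self, input_ids, past_len=0):
        # Stand-in for the library's own behavior: pass on only the
        # tokens that are not yet covered by the cache.
        return {"input_ids": input_ids[past_len:]}


# Keep a handle on the library's original method before patching.
official_prepare_inputs = ToyModel.prepare_inputs


def custom_prepare_inputs(self, input_ids, past_len=0):
    # Hypothetical custom replacement that ignores the cache length.
    return {"input_ids": list(input_ids)}


# Apply the patch, as a compression method might do at import time.
ToyModel.prepare_inputs = custom_prepare_inputs
patched_result = ToyModel().prepare_inputs([1, 2, 3, 4], past_len=3)

# Revert to the original implementation.
ToyModel.prepare_inputs = official_prepare_inputs
restored_result = ToyModel().prepare_inputs([1, 2, 3, 4], past_len=3)
```

Saving the original method before patching makes the revert a one-line assignment, and it keeps the rest of the patched code path untouched.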