SafeAILab / EAGLE

Official Implementation of EAGLE
https://arxiv.org/pdf/2406.16858
Apache License 2.0

Questions about pre-allocated kv_cache. #59

Status: Closed (closed by Hevans123 2 months ago)

Hevans123 commented 3 months ago

Thanks for your great repo. I want to know if the function [initialize_past_key_values()] must be used. If I do not pre-allocate kv_cache, will the acceleration effects be worse?

Liyuhui-12 commented 2 months ago

> I want to know if the function [initialize_past_key_values()] must be used.

It's not mandatory. The pre-allocated kv_cache is not used in modeling_eagle.py.

> If I do not pre-allocate kv_cache, will the acceleration effects be worse?

The speedup ratio will not decrease. The pre-allocated kv_cache is used in the target model, so it makes both the baseline (vanilla autoregressive decoding) and EAGLE faster; without it, both slow down together. The absolute generation speed, however, will decrease.
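
For readers unfamiliar with the idea, here is a minimal sketch of what pre-allocating a KV cache means, compared with growing it at every decoding step. It is not the code from this repository: the class names `PreallocatedCache` and `GrowingCache`, the `truncate` helper, and the use of plain Python lists in place of GPU tensors are all assumptions made for illustration.

```python
class PreallocatedCache:
    """Hypothetical fixed-capacity cache: storage is allocated once up front."""

    def __init__(self, max_length, head_dim):
        # One slot per position, reserved before generation starts.
        self.data = [[0.0] * head_dim for _ in range(max_length)]
        self.current_length = 0  # number of valid positions

    def append(self, rows):
        end = self.current_length + len(rows)
        if end > len(self.data):
            raise ValueError("cache capacity exceeded")
        # Write into the existing slots; no new buffer is created.
        for offset, row in enumerate(rows):
            self.data[self.current_length + offset][:] = row
        self.current_length = end

    def truncate(self, length):
        # Discard trailing entries (e.g. rejected draft tokens) by moving
        # the length marker back; nothing is copied or freed.
        if not 0 <= length <= self.current_length:
            raise ValueError("invalid length")
        self.current_length = length

    def view(self):
        return self.data[: self.current_length]


class GrowingCache:
    """Hypothetical baseline: the whole cache is rebuilt on every append."""

    def __init__(self):
        self.data = []

    def append(self, rows):
        # A brand-new list is built each step, similar in spirit to
        # concatenating tensors at every decoding step.
        self.data = self.data + [list(row) for row in rows]


pre = PreallocatedCache(max_length=8, head_dim=2)
grow = GrowingCache()
for step in ([[1.0, 2.0]], [[3.0, 4.0], [5.0, 6.0]]):
    pre.append(step)
    grow.append(step)

print(pre.view() == grow.data)  # both caches hold the same entries
```

Both variants hold the same contents; the difference is that the pre-allocated one avoids reallocating and copying at each step. That saving applies to any decoding loop that runs the target model, which fits the point above that the baseline and EAGLE benefit alike.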