SafeAILab / EAGLE

Official Implementation of EAGLE
https://arxiv.org/pdf/2406.16858
Apache License 2.0

About KVCache of Eagle vs Origin LLM #61

Closed jcao-ai closed 2 months ago

jcao-ai commented 2 months ago

Hi thanks for this great project.

I have a question: since the EagleModel maintains its own KVCache (past_key_values), there must be some difference depending on whether the input feature (feature + token_embedding) comes from the original LLM or from the EagleModel itself.

From this picture, we can say that:

  1. feature[make] and feature[help] in the first forward pass lead to unbiased inference, because feature[I] comes from the original LLM's feature space, so the past_key_values generated from it are also unbiased;
  2. feature[with] and feature[you] in the second forward pass are biased, because the input feature[help] comes from the Eagle feature space rather than from the original LLM. The same applies to the past_key_values generated at this step.
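The situation described in the two points above can be sketched with a toy stand-in for the draft model (this is not the EAGLE code; the class, the scalar "feature", and the +0.01 offset are all hypothetical, just to make the provenance of each cache entry visible). Only the first step consumes a feature from the base LLM; every later step caches a feature the draft model produced itself:

```python
class ToyDraftModel:
    """Hypothetical stand-in for EagleModel: records every input feature it sees."""

    def __init__(self):
        # Stand-in for the draft model's own past_key_values:
        # each entry remembers which feature space its input came from.
        self.past_key_values = []

    def forward(self, feature, source):
        self.past_key_values.append((source, feature))
        # Pretend the draft model predicts the next feature with a small,
        # systematic offset from what the base LLM would have produced
        # (i.e. its output lives in "draft space", not "base-LLM space").
        return feature + 0.01


draft = ToyDraftModel()

# Step 1: the input feature comes from the base LLM's hidden state,
# so this cache entry is built from an unbiased feature.
f = draft.forward(feature=1.0, source="base_llm")

# Steps 2..N: each input is the draft model's own previous output
# (autoregressive), so every later cache entry is built from
# draft-space features, and the offset compounds step by step.
for _ in range(3):
    f = draft.forward(feature=f, source="draft")

for source, feat in draft.past_key_values:
    print(source, round(feat, 2))
```

Running this prints one `base_llm` entry followed by only `draft` entries, with the feature drifting further from 1.0 at every step, which is exactly the accumulation concern raised in the question.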

Am I right about the two conclusions above? If so, will it also be the case that as the number of generated tokens grows (e.g. max_new_tokens=1000), the error accumulated in past_key_values becomes large, due to the autoregressive, biased inputs?