SafeAILab / EAGLE

Official Implementation of EAGLE
https://arxiv.org/pdf/2406.16858
Apache License 2.0

Question about sampling #74

Closed Monoclinic closed 3 days ago

Monoclinic commented 1 month ago

Hello, I read the code and have a question about this line: https://github.com/SafeAILab/EAGLE/blob/cbc73dcb88bb7541c8f9a0f11f2468ec68c523b6/model/cnets.py#L718 It seems that the probabilities of the tokens sampled from the draft model are not the original softmax probabilities. Instead, they look like conditional probabilities (the probability of the top-2 token is calculated with top-1 excluded, and similarly for top-3, top-4, ...), while the probabilities of the base model are calculated directly by softmax: https://github.com/SafeAILab/EAGLE/blob/cbc73dcb88bb7541c8f9a0f11f2468ec68c523b6/model/utils.py#L376 I wonder why the draft model's probabilities are processed with a cumsum. Will this cause a misalignment between the probabilities of the draft and base models?

Liyuhui-12 commented 1 month ago

This is because EAGLE samples without replacement during the draft stage, which requires adjusting the distribution.
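For readers following along, here is a minimal sketch of what such an adjustment can look like. It is not the code in cnets.py; the function name `sample_without_replacement` and its arguments are hypothetical, and it uses only the Python standard library. The idea is that once a token has been drawn, the next draw happens over the remaining tokens, so the probability it was actually sampled with is its softmax probability divided by the probability mass that is still left.

```python
import random


def sample_without_replacement(probs, k, rng=None):
    """Draw up to k distinct indices, one at a time, from `probs`.

    Returns (picked, adjusted), where adjusted[j] is the probability with
    which picked[j] was actually drawn: its original probability divided
    by the total mass of the indices not yet picked.
    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    available = list(range(len(probs)))
    picked, adjusted = [], []
    for _ in range(min(k, len(available))):
        weights = [probs[i] for i in available]
        remaining_mass = sum(weights)
        if remaining_mass <= 0:
            break  # nothing left that can be sampled
        idx = rng.choices(available, weights=weights, k=1)[0]
        picked.append(idx)
        # Conditional probability given that earlier picks are excluded.
        adjusted.append(probs[idx] / remaining_mass)
        available.remove(idx)
    return picked, adjusted


if __name__ == "__main__":
    softmax_probs = [0.5, 0.3, 0.2]
    picked, adjusted = sample_without_replacement(
        softmax_probs, k=3, rng=random.Random(0)
    )
    for idx, p_adj in zip(picked, adjusted):
        print(f"token {idx}: softmax={softmax_probs[idx]:.3f} adjusted={p_adj:.3f}")
```

In this sketch the first pick keeps its softmax probability, and every later pick gets a probability at least as large as its softmax value, because the denominator shrinks as tokens are removed.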

Monoclinic commented 1 month ago

Hello, thanks for your reply. I understand that sampling without replacement requires an adjustment, but I'm unsure which of the two options (top-k with the original softmax probabilities, or sampling with adjusted probabilities) is mathematically correct, namely consistent with the LLM's output distribution. Or are both correct? By the way, after the adjustment the draft model's probabilities are higher than the top-k probabilities. Would that lead to a slightly higher rejection rate?

Liyuhui-12 commented 3 days ago

Hello, what we need is the actual distribution from which each draft token is sampled, so adjustments are necessary.

Monoclinic commented 3 days ago

Hello, thanks for your reply. I've just read EAGLE-2 and found that the speedup can exceed 4x. However, in the paper the maximum depth of the tree in the expand stage is 4, so the theoretical upper bound on the speedup should be 4x. Is there something wrong?

Liyuhui-12 commented 3 days ago

The depth of the draft tree is 6 (see the appendix). Are you referring to the number 4 based on Figure 7 in our paper? It is a schematic diagram and does not represent the actual tree size.
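As a rough illustration of how tree depth limits the speedup discussed above, the sketch below computes the largest number of tokens one verification pass could yield. It rests on assumptions that are not stated in this thread: that at most one root-to-leaf path of the draft tree is accepted per pass, that the base model contributes one extra token after a fully accepted path (the usual speculative sampling accounting), and that draft-model cost is ignored. The function name `max_tokens_per_step` is hypothetical.

```python
def max_tokens_per_step(tree_depth: int, bonus_token: bool = True) -> int:
    """Upper bound on tokens produced by one base-model verification pass.

    tree_depth:  number of draft tokens on the longest root-to-leaf path.
    bonus_token: assumption that the base model adds one token of its own
                 after a fully accepted path; set to False to count only
                 drafted tokens.
    """
    if tree_depth < 0:
        raise ValueError("tree_depth must be non-negative")
    return tree_depth + (1 if bonus_token else 0)


if __name__ == "__main__":
    for depth in (4, 6):
        print(depth, max_tokens_per_step(depth))
```

Under these assumptions, the bound grows with the depth, so a deeper tree leaves room for a larger speedup. The exact figure depends on how the depth is counted in the paper.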

Monoclinic commented 3 days ago

Got that. Thank you!