Is the rejection and adjusting probability implementation different from normal speculative sampling?

SafeAILab / EAGLE

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)

https://arxiv.org/pdf/2406.16858

Apache License 2.0


Is the rejection and adjusting probability implementation different from normal speculative sampling? #20

Closed AlvL1225 closed 8 months ago

AlvL1225 commented 9 months ago

However, in other implementations, such as GPT-fast or the lucidrains implementation, the difference (GTP - Q) is computed element-wise over the whole distribution, not only at the rejected element. Shouldn't it be done that way here too?

Liyuhui-12 commented 9 months ago

Thanks! The correct approach should be to subtract the two distributions rather than adjust the value of the rejected elements. We have already adjusted the non-greedy code. Since sampling without replacement is performed here, a mask is used to adjust the draft distribution. The rest is consistent with the code in the screenshot you provided.

All the experimental results we provided were under the greedy decoding setting and are not affected.
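
For readers following the thread, here is a minimal sketch of the two adjustments discussed above: the element-wise residual max(0, p - q), renormalized, and a mask that removes an already-rejected token from the draft distribution when candidates are drawn without replacement. This is not the EAGLE code. The function names (`residual_distribution`, `mask_and_renormalize`, `speculative_step`) and the plain-list representation of distributions are assumptions made for illustration only.

```python
import random


def residual_distribution(p, q):
    """Element-wise correction: normalize max(0, p - q) over the whole vocabulary."""
    diff = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    total = sum(diff)
    if total == 0.0:
        # No positive residual mass (e.g. p == q); fall back to the target p.
        return list(p)
    return [d / total for d in diff]


def mask_and_renormalize(q, rejected):
    """Zero out rejected token ids in the draft distribution q, then renormalize."""
    masked = [0.0 if i in rejected else qi for i, qi in enumerate(q)]
    total = sum(masked)
    if total == 0.0:
        return masked  # nothing left to draw from
    return [m / total for m in masked]


def speculative_step(p, q, candidates, rng):
    """Try each candidate in order (hypothetical helper).

    p: target distribution, q: draft distribution, candidates: token ids
    drawn from q without replacement. Returns (token, accepted_flag).
    """
    p, q = list(p), list(q)
    for token in candidates:
        # Accept with probability min(1, p[token] / q[token]).
        if q[token] > 0.0 and rng.random() < min(1.0, p[token] / q[token]):
            return token, True
        # Rejected: subtract the two distributions element-wise ...
        p = residual_distribution(p, q)
        # ... and mask the rejected token out of the draft distribution.
        q = mask_and_renormalize(q, {token})
    # Every candidate was rejected: sample from the adjusted target.
    token = rng.choices(range(len(p)), weights=p)[0]
    return token, False
```

For example, with p = [0.5, 0.4, 0.1] and q = [0.2, 0.2, 0.6], the residual is approximately [0.6, 0.4, 0.0]: every position changes, not just the rejected one, which is the difference the question points at.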