Question on draft process

SafeAILab / EAGLE

Official Implementation of EAGLE

Apache License 2.0

622 stars 59 forks source link

As long as the sampling probabilities are recorded, both sampling methods are correct. This is because speculative sampling does not have any requirements for the draft distribution q. Top-k sampling is equivalent to repeating argmax multiple times. When the draft distribution q and the target distribution p are highly consistent, sampling is better than argmax. For example, if p=(0.6,0.4) and q=(0.6,0.4), the acceptance rate of sampling from q is 1. When using argmax, it is equivalent to sampling from the distribution (1.0,0.0), and the acceptance rate is 0.6.

SafeAILab / EAGLE

Question on draft process #58