hao-ai-lab / LookaheadDecoding

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
https://arxiv.org/abs/2402.02057
Apache License 2.0

Any analysis on the impact on accuracy #16

Closed qizzzh closed 1 year ago

qizzzh commented 1 year ago

Just curious if any analysis has been done on the accuracy impact.

Viol2000 commented 1 year ago

Lookahead decoding does not change the output distribution. In theory, the outputs should be identical to Hugging Face's greedy search outputs. In empirical experiments, results sometimes differ from Hugging Face's greedy search when using FP16; we attribute this to floating-point error, not a drop in accuracy. With FP32, the output is exactly the same as Hugging Face's greedy search.
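
One way to check this yourself is to run the same greedy generation twice, once with lookahead decoding enabled and once with plain Hugging Face decoding, and compare the token sequences. The sketch below assumes the `lade.augment_all()` / `lade.config_lade(...)` setup calls shown in this repository's README; the model name, prompt, and lookahead parameters (`LEVEL`, `WINDOW_SIZE`, `GUESS_SET_SIZE`) are illustrative placeholders.

```python
# Sketch of an equivalence check between lookahead decoding and HF greedy search.
# The lade.* calls follow the usage shown in this repository's README; model name,
# prompt, and lookahead parameters below are illustrative placeholders.
import os
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Enable lookahead decoding only when USE_LADE=1 is set in the environment.
if os.environ.get("USE_LADE", "0") == "1":
    import lade
    lade.augment_all()
    lade.config_lade(LEVEL=5, WINDOW_SIZE=7, GUESS_SET_SIZE=7, DEBUG=0)

model_name = "meta-llama/Llama-2-7b-chat-hf"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)

# FP32 is expected to match greedy search exactly; switch to torch.float16 to
# observe the occasional floating-point-level divergence mentioned above.
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float32, device_map="auto"
)

prompt = "Explain lookahead decoding in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding (do_sample=False): with lookahead decoding enabled, the
# generated tokens should be the same as the plain Hugging Face run.
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Running this script with and without `USE_LADE=1` and diffing the printed outputs should show identical text in FP32; any token-level mismatch in FP16 reflects floating-point error rather than a change in the decoding result.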

qizzzh commented 1 year ago

Thank you, that makes sense.