hemingkx / Spec-Bench

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
https://sites.google.com/view/spec-bench
Apache License 2.0

accuracy of next-token and next-next-token #8

Closed. Dbxwz closed this issue 4 months ago

Dbxwz commented 4 months ago

Thanks for your great work! I think the draft token acceptance rate has a large impact on the speedup ratio, but I couldn't find a comparison of draft token acceptance rates across the different methods.
I'm hoping Readme.md can be updated with one, similar to what is shown in the figure below.

[Image: Screenshot 2024-05-22 14 40 13]

hemingkx commented 4 months ago

Thanks for your issue!

We plan to support token acceptance rate statistics in the near future, as noted in our roadmap. The implementation presents two main challenges:

  1. Draft Length Variability: Each method uses a different draft length, so collecting these statistics requires reading and modifying the source code of each method individually. This is a considerable workload.
  2. Token Tree Drafts: As mentioned in the Eagle paper, token acceptance rate statistics are less applicable to token tree drafts, because multiple candidate tokens are sampled at each position and only one of them is accepted. An appropriate evaluation framework is therefore needed to unify the statistics across methods.
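
To make the two challenges concrete, here is a minimal Python sketch. It is not Spec-Bench code: `DraftStep`, `chain_acceptance_rate`, `tree_acceptance_rate`, and `mean_accepted_per_step` are hypothetical names, and the numbers are made up for illustration. It assumes each decoding step records how many tokens were drafted and how many the target model accepted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DraftStep:
    """One draft-then-verify step (hypothetical record, for illustration only)."""
    drafted: int   # number of draft tokens proposed in this step
    accepted: int  # number of draft tokens the target model accepted


def chain_acceptance_rate(steps: List[DraftStep]) -> float:
    """Accepted / drafted for a single-chain draft (one candidate per position)."""
    drafted = sum(s.drafted for s in steps)
    accepted = sum(s.accepted for s in steps)
    return accepted / drafted if drafted else 0.0


def tree_acceptance_rate(tree_nodes: int, accepted_path_len: int) -> float:
    """The same ratio applied naively to a token tree.

    Every node in the tree counts as "drafted", but only the nodes on one
    root-to-leaf path can ever be accepted, so the ratio is capped well below 1.
    """
    return accepted_path_len / tree_nodes if tree_nodes else 0.0


def mean_accepted_per_step(steps: List[DraftStep]) -> float:
    """Average accepted tokens per step; does not depend on the draft's shape."""
    return sum(s.accepted for s in steps) / len(steps) if steps else 0.0


# Chain draft of length 5 over three steps (made-up numbers).
chain_steps = [DraftStep(5, 3), DraftStep(5, 5), DraftStep(5, 2)]
chain_rate = chain_acceptance_rate(chain_steps)

# Tree draft with 3 candidates at each of 4 positions, counted as 12 nodes.
# Even if a full depth-4 path is accepted, the naive rate is only 4 / 12.
tree_rate = tree_acceptance_rate(tree_nodes=12, accepted_path_len=4)

print(f"chain acceptance rate: {chain_rate:.3f}")
print(f"naive tree acceptance rate: {tree_rate:.3f}")
print(f"mean accepted tokens per step (chain): {mean_accepted_per_step(chain_steps):.3f}")
```

In this toy setup the tree draft scores lower on the naive ratio even though its entire deepest path was accepted, which is why a per-token acceptance rate is hard to compare across chain and tree methods. A shape-independent quantity such as accepted tokens per step is one possible common ground, though that is an assumption of this sketch rather than a decided design.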

We are working on this and will provide updates soon. Stay tuned!