Infini-AI-Lab / Sequoia

scalable and robust tree-based speculative decoding algorithm

Estimate the number of generated tokens per step from the acceptance-rate-vector? #14

Open KexinFeng opened 3 months ago

KexinFeng commented 3 months ago

Hi,

If I understand the tree_search algorithm correctly, the dynamic programming process should find the optimal expected number of generated tokens per step for a given acceptance-rate vector. Likewise, given an acceptance-rate vector and a candidate tree, the expected number of generated tokens can be computed directly. But this is just theory: in the paper, the number of generated tokens is measured in experimental runs. I'm wondering whether these experimentally measured numbers agree with the theoretical optimum.
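To make the second computation concrete, here is a minimal sketch. It assumes the positional acceptance model as I understand it from the paper: the k-th child of any node is accepted with probability `p[k]`, independent of context, so a node's score is the product of the acceptance rates along its path from the root, and the expected number of generated tokens is the sum of scores over all nodes (the root scores 1). The nested-list tree encoding and the name `expected_tokens` are hypothetical, chosen for illustration; they are not the repo's `tree_maps` format or API.

```python
from typing import List, Sequence

# Hypothetical encoding: a node is the list of its children, in sibling order.
# A leaf is therefore an empty list.
Tree = List["Tree"]


def expected_tokens(tree: Tree, p: Sequence[float], path_prob: float = 1.0) -> float:
    """Sum of path acceptance probabilities over all nodes, root included.

    p[k] is the assumed acceptance rate of the k-th child of any node.
    """
    total = path_prob  # contribution of the current node
    for k, child in enumerate(tree):
        # Assumption: child positions beyond len(p) are never accepted.
        pk = p[k] if k < len(p) else 0.0
        total += expected_tokens(child, p, path_prob * pk)
    return total


# Root with two children; the first child has one child of its own.
example_tree: Tree = [[[]], []]
example_p = [0.6, 0.2]
estimate = expected_tokens(example_tree, example_p)
# Node scores: 1 (root) + 0.6 + 0.6 * 0.6 + 0.2
```

Comparing such an estimate with the measured tokens per step would require the actual acceptance vectors that were used to build each tree.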

I was trying to verify this, but the repo contains only the tree_maps; the acceptance vectors are missing. Have you considered this estimate before? Alternatively, could you share the acceptance vectors so that, together with the corresponding trees, I can quickly verify it myself?

Thanks!

dreaming-panda commented 2 months ago

We plan to refactor the repo and will share the files. Thank you!