SafeAILab / EAGLE

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
https://arxiv.org/pdf/2406.16858
Apache License 2.0

The number of candidate nodes and the maximum prediction length setting problem in the candidate tree #6

Closed shifeiwen closed 9 months ago

shifeiwen commented 9 months ago

Thank you for your excellent work. I have read your code and have some questions about the candidate tokens. In the code, the number of nodes in the designed tree is 26, which means that seq_len in the decode layer is 26 each time. I don't know whether this 26 has any special meaning or is the best result from an experiment. On some edge devices, or when computing power is insufficient, a long seq_len combined with a low hit rate will cause performance degradation, so I am wondering whether I could reduce the number of candidate tokens, or set this parameter as a hyperparameter. I hope to hear some of your thoughts. @Liyuhui-12

shifeiwen commented 9 months ago

In the function that checks whether a match succeeds, why is the value of random.random used to determine whether the match is successful?

Liyuhui-12 commented 9 months ago

Thank you for your interest.

> ... which means that the length of seq_len in the decode layer is 26 each time.

26 refers to the number of nodes in the generation tree. We haven't fine-tuned the structure or size of the generation tree; it is based mostly on intuition: branches for high-probability tokens should be deeper. Using a smaller tree might be more effective on edge devices.

> I am wondering whether I could reduce the number of candidate tokens, or set this parameter as a hyperparameter.

You can reduce the number of candidate tokens. The structure of the tree (which determines seq_len) is a hyperparameter, and you can easily adjust it. This hyperparameter is located in model/choices.py. The list mc_sim_7b_63 represents the structure of the tree, where each sublist corresponds to a node in the tree. For example, given the query "I", [0] corresponds to the most probable token (say "am") following "I", and [0, 1] corresponds to the second most probable token (say "grateful") following "I am".
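To illustrate, here is a hypothetical, much smaller tree in the same sublist format as mc_sim_7b_63 (the variable name `small_tree` and the helper `tree_stats` are illustrative, not part of the repo). Each sublist is a path from the root, where the i-th entry selects the i-th most probable child at that depth:

```python
# Hypothetical sketch of a small EAGLE-style choices tree (same format as
# mc_sim_7b_63 in model/choices.py). Each sublist is a root-to-node path;
# the i-th entry picks the i-th most probable child at depth i.
small_tree = [
    [0],        # most probable token after the query
    [1],        # second most probable token after the query
    [0, 0],     # most probable token following path [0]
    [0, 1],     # second most probable token following path [0]
    [0, 0, 0],  # deepen the highest-probability branch
]

def tree_stats(choices):
    """Return (number of nodes, maximum depth) of a choices tree.
    The node count is what determines the draft seq_len per step."""
    return len(choices), max(len(path) for path in choices)

nodes, depth = tree_stats(small_tree)
print(nodes, depth)  # 5 nodes, depth 3
```

Shrinking the list from 26 sublists to a handful like this directly reduces the per-step seq_len, which is the lever you would tune for edge devices.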

Liyuhui-12 commented 9 months ago

> In the function that checks whether a match succeeds, why is the value of random.random used to determine whether the match is successful?

You can find the pseudocode corresponding to this part of the code in our blog. It actually involves recursively applying speculative sampling. The proof that speculative sampling is distribution-invariant can be found in Appendix A of the paper.
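As a rough sketch of the standard speculative-sampling acceptance test (not EAGLE's exact code; `p_target`, `q_draft`, and `speculative_accept` are illustrative names), the drafted token is accepted with probability min(1, p(token) / q(token)), which is where random.random comes in:

```python
import random

def speculative_accept(p_target, q_draft, token, rng=random.random):
    """Accept a drafted token with probability min(1, p(token) / q(token)).
    p_target and q_draft map tokens to their probabilities under the
    target model and the draft model, respectively. Comparing a uniform
    sample from rng() against this ratio implements the accept/reject
    step that keeps the output distribution identical to the target's."""
    p, q = p_target[token], q_draft[token]
    return rng() < min(1.0, p / q)

# If the draft model over-proposed a token (q = 0.5, p = 0.25), it is
# accepted only half the time; on rejection, speculative sampling
# resamples from the residual distribution to preserve p exactly.
p_target = {"am": 0.25}
q_draft = {"am": 0.5}
accepted = speculative_accept(p_target, q_draft, "am")
```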

If you have any further questions, please feel free to continue commenting.

shifeiwen commented 9 months ago

Thanks for your reply, I will run some experiments.