SafeAILab / EAGLE

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
https://arxiv.org/pdf/2406.16858
Apache License 2.0

Can I change the choices tree as I want?? #9

Closed je1lee closed 9 months ago

je1lee commented 9 months ago

Is it possible to change choices.py to define a different tree architecture? Alternatively, is there a way to try decoding without the tree-related process? Also, do you have any results on how many draft tokens are accepted per base-model verification step?

Liyuhui-12 commented 9 months ago

Is it possible to change choices.py to define another tree architecture?

Of course. You can refer to #6.

Is there any way to try decoding without the tree-related process?

A chain is a special case of a tree: setting the choices to [[0],[0,0]...] gives a chain structure.
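As a minimal sketch of the chain-as-tree idea above: the snippet builds a choices list in the `[[0],[0,0]...]` form and checks whether a given list is a chain. The helper names `make_chain` and `is_chain` are hypothetical and are not part of the EAGLE repository. The reading that each path lists child indices from the root, with index 0 meaning the first candidate at that depth, is an assumption based on the format shown in the reply.

```python
from typing import List

Path = List[int]  # one tree node, written as child indices from the root


def make_chain(depth: int) -> List[Path]:
    """Build a chain of the given depth: [[0], [0, 0], ..., [0] * depth].

    Hypothetical helper; assumes index 0 selects the first child at each depth.
    """
    return [[0] * d for d in range(1, depth + 1)]


def is_chain(choices: List[Path]) -> bool:
    """Return True if the choices describe a single path with no branching."""
    paths = sorted(choices, key=len)
    # A chain has exactly one node per depth, and every index on it is 0.
    return bool(paths) and all(p == [0] * (i + 1) for i, p in enumerate(paths))


chain = make_chain(4)
print(chain)
print(is_chain(chain), is_chain([[0], [1], [0, 0]]))
```

A list like this could then be substituted for the tree defined in choices.py, subject to how the repository actually loads that file.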

Do you have any results on how many draft tokens are accepted per base-model verification step?

If you use UI inference, the 'compression ratio' box in the top right corner displays what you are looking for.

This is the experimental result on MT-bench.

| Model | Compression Ratio |
| --- | --- |
| Vicuna 7B | 3.94 |
| Vicuna 13B | 3.98 |
| Vicuna 33B | 3.68 |
| LLaMA2-Chat 7B | 3.62 |
| LLaMA2-Chat 13B | 3.90 |
| LLaMA2-Chat 70B | 3.80 |

je1lee commented 9 months ago

Thanks for the reply! It works with a good acceleration rate even without tree decoding, and I am very impressed that a single decoder layer can perform this well. Do you have any academic references that motivate your architectural choice (a single decoder layer) for the draft model? What is mentioned in the blog seems a little ambiguous to me.

Liyuhui-12 commented 9 months ago

Thank you for your interest. We are currently writing a paper on EAGLE that discusses its structure and other issues. Once the paper is completed, I will post the link here.

je1lee commented 9 months ago

Thanks for the reply! May I ask a few more questions about the blog?

[Image: screenshot, 2023-12-18, 3:42:47 PM]
  1. Is "simple" at line 10 a typo for "sample"? If it is, why is x sampled from q and not from p?

  2. What does n stand for? Every token that makes up the tree? Would n be 10 for the tree shape in Figure 3?

Liyuhui-12 commented 9 months ago
  1. Thank you very much! It is a typo; x is sampled from p.

  2. n represents the number of child nodes of the current node. For Figure 3, if the current node is I, then n=2, x1 is "may", and x2 is "help".
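To make the meaning of n concrete, here is a small sketch that counts the children of each node in a choices-style tree. The tree below is a made-up example, not the tree from Figure 3, and `child_counts` is a hypothetical helper. It assumes each path lists child indices from the root, so the parent of a path is that path with its last index removed.

```python
from collections import Counter
from typing import Dict, List, Tuple


def child_counts(choices: List[List[int]]) -> Dict[Tuple[int, ...], int]:
    """Map each parent node (as a tuple path; () is the root) to its child count n."""
    counts: Counter = Counter()
    for path in {tuple(p) for p in choices}:  # de-duplicate repeated paths
        counts[path[:-1]] += 1  # the parent is the path without its last index
    return counts


# Hypothetical tree: the root has two children, [0] has two, [1] has one.
example_tree = [[0], [1], [0, 0], [0, 1], [1, 0]]
n = child_counts(example_tree)
print(n[()], n[(0,)], n[(1,)], n[(0, 0)])
```

Under this reading, n is a per-node quantity: it differs from node to node and is not the total number of tokens in the tree.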