Using EAGLE will slow down inference

SafeAILab / EAGLE

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)

https://arxiv.org/pdf/2406.16858

Apache License 2.0

780 stars 79 forks source link

Using EAGLE will slow down inference #73

Closed zkqq closed 2 months ago

zkqq commented 4 months ago

Thank you very much for your work on EAGLE; it has been extremely helpful to me.

I have a question: why does downloading yuhuili/EAGLE-Vicuna-7B-v1.3 from Hugging Face and using it directly to accelerate lmsys/vicuna-7b-v1.3 result in a negative effect? However, using my own trained EAGLE head produces a speedup effect. Could you please tell me where I went wrong?

Below is a screenshot of my operation.

I would greatly appreciate any assistance you can provide in resolving this issue. Thank you very much.

cdliang11 commented 4 months ago

Maybe try temperature=0.

zkqq commented 4 months ago

Maybe try temperature=0.

Thank you very much for your valuable advice. However, I obtained the same result regardless of the temperature.

Liyuhui-12 commented 4 months ago

@zkqq The correct drafts will be displayed in yellow. I noticed that there are almost no yellow words in your image. You may not have correctly matched the draft model with the base model, or you did not set the --model-type parameter. Its default value is llama-2-chat, and it must be changed to vicuna.

zkqq commented 4 months ago

yuhuili/EAGLE-Vicuna-7B-v1.3

Thank you very much for your reply. You are correct; the issue likely stems from the mismatch between the EAGLE head and the origin model. However, I believe I have configured all necessary parameters, including the model type.

I trained an EAGLE head, ran webui.py and the evaluation, and observed a good acceleration effect. However, when switching back to the EAGLE head from yuhuili/EAGLE-Vicuna-7B-v1.3, there are negative impacts. Both config.json are identical, with the only difference being the _pytorchmodel.bin file.

Liyuhui-12 commented 4 months ago

No issues were encountered when using yuhuili/EAGLE-Vicuna-7B-v1.3, but there are issues with the weights you trained yourself?

zkqq commented 4 months ago

yuhuili/EAGLE-Vicuna-7B-v1.3,

On the contrary, there is no issue with utilizing the model weights that I have trained personally. However, employing the yuhuili/EAGLE-Vicuna-7B-v1.3 weights may result in adverse effects.

Liyuhui-12 commented 3 months ago

The possible reason is that the template or weights of your base model are different from those used when we trained the draft model.