
SafeAILab / EAGLE

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)

https://arxiv.org/pdf/2406.16858

Apache License 2.0

826 stars, 81 forks

Support Chinese Task #16

Closed by Ishiki-Iroha 11 months ago

Ishiki-Iroha commented 11 months ago

We tested a small set of Chinese tasks (about 50) on Vicuna (7b and 13b) and found that the speedup ratio on Chinese tasks was lower than on English tasks. Is this expected? Here are some results:

Metric	vicuna-7b	vicuna-13b
Baseline (tokens/s)	39.63	23.13
EAGLE (tokens/s)	65.59	42.27
Speedup ratio (EAGLE / baseline)	1.66	1.83
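
For reference, the ratio row above can be reproduced from the two throughput rows. The sketch below is only an illustration of that arithmetic: the helper name `speedup_ratio` and the `results` dictionary are made up for this example and are not part of the EAGLE codebase.

```python
def speedup_ratio(baseline_tps: float, accelerated_tps: float) -> float:
    """Return accelerated throughput divided by baseline throughput."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return accelerated_tps / baseline_tps


# Throughput figures (tokens/s) copied from the table above:
# model name -> (baseline, EAGLE).
results = {
    "vicuna-7b": (39.63, 65.59),
    "vicuna-13b": (23.13, 42.27),
}

for model, (baseline, eagle) in results.items():
    print(f"{model}: {speedup_ratio(baseline, eagle):.2f}x")
# prints:
# vicuna-7b: 1.66x
# vicuna-13b: 1.83x
```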

hongyanz commented 11 months ago

We use the ShareGPT dataset to train EAGLE. The dataset's description lists the following cleaning step: "Removing excessive unicode (indicative of Chinese or Korean text, usually)." Chinese tasks are therefore entirely out-of-distribution for EAGLE.
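
To make the effect of such a cleaning step concrete, here is a minimal sketch of a filter that drops text with a high share of non-ASCII characters. This is an assumption about what an "excessive unicode" filter might look like, not the actual ShareGPT cleaning code: the function names, the 0.2 threshold, and the sample strings are all hypothetical.

```python
def non_ascii_fraction(text: str) -> float:
    """Fraction of characters whose code point lies outside the ASCII range."""
    if not text:
        return 0.0
    return sum(1 for ch in text if ord(ch) > 127) / len(text)


def has_excessive_unicode(text: str, threshold: float = 0.2) -> bool:
    """Hypothetical rule: flag text whose non-ASCII share exceeds the threshold."""
    return non_ascii_fraction(text) > threshold


# Hypothetical samples: an English prompt and a Chinese prompt.
samples = [
    "How do I reverse a list in Python?",
    "如何在 Python 中反转列表？",
]

# Under this rule, only the English sample survives the filter.
kept = [s for s in samples if not has_excessive_unicode(s)]
print(kept)
```

Under a rule like this, almost no Chinese conversations would remain in the training set, which is consistent with the explanation above.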

If you are interested in Chinese tasks, we suggest training EAGLE on a Chinese corpus.