Hi, thanks for your great work! When I tested the EAGLE-llama2-chat-7B weights you provided, the average acceptance length I measured was lower than the value reported in the paper. I computed it by summing all the accept_lengths, dividing by the total number of inferences, and then adding 1 for the final token sampled by the large model.
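To make the calculation concrete, here is a minimal sketch of what I am doing. The function name and the input list are hypothetical and not taken from the EAGLE code; I am assuming one accept_length value is recorded per inference of the large model.

```python
def average_acceptance_length(accept_lengths):
    """Mean number of tokens produced per inference of the large model.

    accept_lengths: one entry per inference, each the number of draft
    tokens accepted in that inference (hypothetical input format).
    """
    if not accept_lengths:
        raise ValueError("accept_lengths must not be empty")
    # Average accepted draft tokens per inference.
    mean_accepted = sum(accept_lengths) / len(accept_lengths)
    # Add 1 for the token sampled by the large model itself.
    return mean_accepted + 1


# Example with made-up values, for illustration only (not measured results).
example = [2, 3, 1, 2]
print(average_acceptance_length(example))
```

If the paper's number is computed differently (for example, if the recorded accept_lengths already include the extra token), that could explain the gap.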