hao-ai-lab / LookaheadDecoding

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
https://arxiv.org/abs/2402.02057
Apache License 2.0

Adaptive LEVEL/GUESS size? #51

Open sahel-sh opened 8 months ago

sahel-sh commented 8 months ago

Great work! Looking at the code, my understanding is that even though LEVEL/GUESS_SIZE is configurable, it stays fixed for the entire inference run for a given input prompt. Have you looked into changing it dynamically based on the observed max_hits value, or increasing it on some schedule as more output tokens are generated?
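To make the suggestion concrete, here is a minimal sketch of what such a controller might look like. This is purely hypothetical and not part of the LookaheadDecoding codebase: the class name, thresholds, and the idea of averaging `max_hits` over a sliding window are all assumptions for illustration. The idea is to grow LEVEL/GUESS_SIZE when most guessed tokens keep getting accepted, and shrink them when hits are rare to cut verification overhead.

```python
class AdaptiveLookaheadConfig:
    """Hypothetical controller that tunes LEVEL and GUESS_SIZE between
    decoding steps, based on the max_hits values observed so far.
    (Illustrative sketch only; not an API of LookaheadDecoding.)"""

    def __init__(self, level=5, guess_size=5, min_size=2, max_size=10):
        self.level = level
        self.guess_size = guess_size
        self.min_size = min_size
        self.max_size = max_size
        self._hits = []  # recent max_hits observations

    def update(self, max_hits, window=8):
        # Collect a sliding window of max_hits from recent steps,
        # then adjust once per full window.
        self._hits.append(max_hits)
        if len(self._hits) < window:
            return
        avg = sum(self._hits) / len(self._hits)
        self._hits.clear()
        if avg >= 0.8 * self.guess_size:
            # Most guesses are accepted: a larger window may pay off.
            self.guess_size = min(self.guess_size + 1, self.max_size)
            self.level = min(self.level + 1, self.max_size)
        elif avg <= 0.2 * self.guess_size:
            # Hits are rare: shrink to reduce wasted verification compute.
            self.guess_size = max(self.guess_size - 1, self.min_size)
            self.level = max(self.level - 1, self.min_size)
```

In use, the decoding loop would call `update(max_hits)` after each step and read `cfg.level` / `cfg.guess_size` before the next one; the window-averaging is just one way to smooth out step-to-step noise in hit counts.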