hao-ai-lab / LookaheadDecoding

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
https://arxiv.org/abs/2402.02057
Apache License 2.0

Why does the tup need to be added to the tail of token_map when it is found in the token_map? #34

Closed: kevinoldching closed this issue 9 months ago

kevinoldching commented 9 months ago

Why does the tup need to be added to the tail of token_map when it is already found in the token_map?

    if tup in token_map[past_tokens[0][i - 1]]:
        token_map[past_tokens[0][i - 1]].remove(tup)
        token_map[past_tokens[0][i - 1]].append(tup)

https://github.com/hao-ai-lab/LookaheadDecoding/blob/973edc8e30837bc6af0b7f0176dd8c609132af68/lade/decoding.py#L375C1-L377C69

hsm1997 commented 9 months ago

See here: https://github.com/hao-ai-lab/LookaheadDecoding/blob/main/lade/decoding.py#L382. The first element of token_map[past_tokens[0][i - 1]] is discarded when the list reaches full size (GUESS_SET_SIZE), so I guess the author uses the "add-to-tail" operation to keep the "newest" n-gram tups.
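A minimal sketch of that bookkeeping, assuming a standalone `update_pool` helper and a tiny `GUESS_SET_SIZE` for illustration (the repository does this inline in `lade/decoding.py`, not through such a helper): re-inserting a known n-gram moves it to the tail, so when the pool is full the head, i.e. the least recently seen candidate, is the one evicted.

```python
GUESS_SET_SIZE = 3  # small value for illustration only


def update_pool(token_map: dict, key: int, tup: tuple) -> None:
    """Keep at most GUESS_SET_SIZE candidate n-grams per key, newest at the tail."""
    pool = token_map.setdefault(key, [])
    if tup in pool:
        # The n-gram was just re-observed: move it to the tail so it becomes
        # the last candidate to be evicted.
        pool.remove(tup)
        pool.append(tup)
    elif len(pool) < GUESS_SET_SIZE:
        pool.append(tup)
    else:
        # Pool is full: drop the head (the stalest n-gram) and append the new one.
        pool.pop(0)
        pool.append(tup)


# Example run: the pool behaves like a small LRU set of n-grams.
token_map = {}
for tup in [(1, 2), (3, 4), (5, 6), (1, 2), (7, 8)]:
    update_pool(token_map, key=0, tup=tup)
print(token_map[0])  # [(5, 6), (1, 2), (7, 8)] -- (3, 4) evicted, re-seen (1, 2) kept
```

Without the add-to-tail step, (1, 2) would have stayed at the head and been evicted first even though it was the most recently observed n-gram.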

kevinoldching commented 9 months ago

> See here: https://github.com/hao-ai-lab/LookaheadDecoding/blob/main/lade/decoding.py#L382. The first element of token_map[past_tokens[0][i - 1]] is discarded when the list reaches full size (GUESS_SET_SIZE), so I guess the author uses the "add-to-tail" operation to keep the "newest" n-gram tups.

Yes, I think you are right. Thanks.