hao-ai-lab / LookaheadDecoding

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
https://arxiv.org/abs/2402.02057
Apache License 2.0
1.15k stars · 67 forks

Bug in Greedy Search #62

Open david-wei-01001 opened 6 months ago

david-wei-01001 commented 6 months ago

In lade/decoding.py, line 1175, the original code:

else:
                all_old_tokens.append(hits[max_hit])

should be changed to:

else:
                all_old_tokens.append(hits[hit_idx])

Otherwise, since `max_hit` is fixed for the whole loop, every iteration appends the same element, `hits[max_hit]`, instead of walking through the hits one by one.
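A minimal sketch of the bug pattern (the function and variable names here are assumed for illustration, not taken from the actual LookaheadDecoding source):

```python
def append_verified_tokens(all_old_tokens, hits, max_hit):
    """Append the first max_hit verified tokens from hits."""
    for hit_idx in range(max_hit):
        # Buggy version: indexes with the loop bound, so every iteration
        # appends the same element, hits[max_hit]:
        #     all_old_tokens.append(hits[max_hit])
        # Fixed version: index with the loop variable instead.
        all_old_tokens.append(hits[hit_idx])
    return all_old_tokens

tokens = append_verified_tokens([], hits=[11, 22, 33, 44], max_hit=3)
print(tokens)  # [11, 22, 33] -- not [44, 44, 44]
```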