hao-ai-lab / LookaheadDecoding

Apache License 2.0

Related work: Prompt lookup decoding #45

Open shermansiu opened 5 months ago

shermansiu commented 5 months ago

https://github.com/apoorvumang/prompt-lookup-decoding

This method was recently merged into Hugging Face transformers and also uses n-grams (found in the input prompt) to accelerate decoding.

learning-chip commented 5 months ago

Interesting, could you point to the merged PR? Does it support batching?

This method has a similar idea (copy from input, no Jacobi): https://github.com/alipay/PainlessInferenceAcceleration

shermansiu commented 5 months ago

Here's the PR: https://github.com/huggingface/transformers/pull/27775

From a cursory glance at the PR, it seems like it supports batching.

dongxiaolong commented 5 months ago

> Here's the PR: huggingface/transformers#27775
>
> From a cursory glance at the PR, it seems like it supports batching.

I have also noticed these two methods. Do you know the specific difference between them?

shermansiu commented 5 months ago

Lookahead decoding takes its n-grams from prior lookahead decoding steps (the Jacobi trajectories). Prompt lookup decoding takes its n-grams from the prompt.
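To make the prompt-lookup side concrete, here is a rough sketch of how draft tokens could be pulled from the prompt. It is a simplification for illustration, not the code from the transformers PR: the function name `prompt_lookup_candidates` and the parameters `ngram_size` and `num_draft` are hypothetical, and it uses a single fixed n-gram size.

```python
from typing import List


def prompt_lookup_candidates(
    tokens: List[int], ngram_size: int = 3, num_draft: int = 5
) -> List[int]:
    """Propose draft tokens by matching the trailing n-gram earlier in `tokens`.

    Hypothetical helper for illustration only, not the transformers API.
    `tokens` is the prompt plus everything generated so far.
    """
    if len(tokens) <= ngram_size:
        return []
    tail = tokens[-ngram_size:]
    # Scan earlier positions from most recent to oldest, skipping the
    # trailing n-gram itself (it would trivially match).
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if tokens[start:start + ngram_size] == tail:
            # The tokens that followed the earlier occurrence are the draft.
            end = start + ngram_size + num_draft
            return tokens[start + ngram_size:end]
    return []  # no match: fall back to ordinary one-token decoding


if __name__ == "__main__":
    history = [1, 2, 3, 4, 5, 6, 1, 2, 3]
    print(prompt_lookup_candidates(history, ngram_size=3, num_draft=2))
```

The draft tokens still have to be verified by the model in one forward pass, the same as in other speculative schemes. Lookahead decoding differs in where the candidate n-grams come from: it collects them from its own Jacobi iterations rather than from text that is already in the context.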

learning-chip commented 5 months ago

> it seems like it supports batching.

It doesn't :/ https://github.com/huggingface/transformers/pull/27775#issuecomment-1901225695

shermansiu commented 5 months ago

Interesting. As the comment also suggests, it seems like PLD can support batching in theory - it's just the implementation that doesn't support it.
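As a rough illustration of why batching looks feasible in principle, here is a sketch that runs the lookup per sequence and pads the drafts to a common width. This is my own assumption about what a batched version might need, not what the PR does: `find_draft`, `batched_drafts`, and the `PAD` value are all hypothetical names.

```python
from typing import List, Tuple

PAD = -1  # hypothetical padding id, chosen only for this sketch


def find_draft(tokens: List[int], ngram_size: int, num_draft: int) -> List[int]:
    """Return tokens that followed the latest earlier match of the trailing n-gram."""
    if len(tokens) <= ngram_size:
        return []
    tail = tokens[-ngram_size:]
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if tokens[start:start + ngram_size] == tail:
            return tokens[start + ngram_size:start + ngram_size + num_draft]
    return []


def batched_drafts(
    batch: List[List[int]], ngram_size: int = 2, num_draft: int = 4
) -> Tuple[List[List[int]], List[int]]:
    """Look up a draft for each row, then pad all drafts to the same width.

    Returns the padded drafts and the true draft length of each row, so a
    verification step could ignore the padded positions.
    """
    drafts = [find_draft(seq, ngram_size, num_draft) for seq in batch]
    lengths = [len(d) for d in drafts]
    width = max(lengths, default=0)
    padded = [d + [PAD] * (width - len(d)) for d in drafts]
    return padded, lengths


if __name__ == "__main__":
    padded, lengths = batched_drafts([[1, 2, 3, 1, 2], [7, 8, 9]])
    print(padded, lengths)
```

The lookup itself is independent per row. The awkward part is that each row can have a different draft length and, after verification, a different number of accepted tokens, so the rows drift out of alignment and need padding or masking.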

jivanph commented 5 months ago

Lookahead was mentioned here: https://github.com/SafeAILab/EAGLE