apoorvumang / prompt-lookup-decoding


Question about the correctness of HF PLD implementation #5

Closed Kevin-XiongC closed 4 months ago

Kevin-XiongC commented 4 months ago

[ENV] transformers 4.39.3 pytorch 2.1.2 cuda=11.8

Thanks for the work! I've tried to use PLD in my deployment of LLaMA-7B. However, I find that greedy decoding (argmax, temperature == 0) with PLD produces results inconsistent with the output without PLD. [screenshot]

Is the implementation on HF correct?
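For context on why this is a bug rather than expected behavior: prompt lookup decoding only *drafts* candidate tokens by matching the current suffix n-gram against earlier text in the prompt; the model then verifies each candidate against its own argmax, so with temperature == 0 the accepted output must be token-for-token identical to plain greedy decoding. Below is a minimal pure-Python sketch of the candidate-drafting step (the function name, parameters, and matching order are illustrative, not the exact transformers implementation):

```python
def find_candidate_tokens(input_ids, max_ngram_size=3, num_pred_tokens=10):
    """Draft candidate tokens via prompt lookup: find an earlier occurrence
    of the current suffix n-gram and return the tokens that followed it.

    input_ids: list of token ids seen so far (prompt + generated tokens).
    Returns up to num_pred_tokens draft tokens, or [] if no match is found.
    """
    for ngram_size in range(max_ngram_size, 0, -1):
        ngram = input_ids[-ngram_size:]
        # Scan earlier positions (most recent first); exclude the suffix itself.
        for start in range(len(input_ids) - ngram_size - 1, -1, -1):
            if input_ids[start:start + ngram_size] == ngram:
                follow = input_ids[start + ngram_size:start + ngram_size + num_pred_tokens]
                if follow:
                    return follow
    return []
```

Whatever this drafting step returns, greedy verification should accept only tokens the model itself would have emitted, so any divergence from non-PLD greedy output points to a bug in the verification/acceptance logic, not in the lookup.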

apoorvumang commented 4 months ago

I think there is a bug in the latest transformers implementation.

Could you please try downgrading transformers (e.g. to version 4.37) and retry? That would be extremely helpful!

Kevin-XiongC commented 4 months ago

> I think there is a bug in the latest transformers implementation.
>
> Could you please try downgrading transformers (e.g. to version 4.37) and retry? That would be extremely helpful!

Yes, after downgrading to 4.37 it works as expected. Thanks!