abertsch72 / unlimiformer

Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
MIT License
1.05k stars 77 forks

Unused variable `q_embed` in the Llama's `preprocess_query` method #31

Closed · seunghyukoh closed this 12 months ago

seunghyukoh commented 1 year ago

Hi, while reviewing the `UnlimiformerLLaMa` class, I found an unused variable in the `preprocess_query` method.

The variable `q_embed` looks quite important, since it's related to LLaMA's rotary embedding. Is this intentional?

Thanks!

urialon commented 12 months ago

Hi @jake-seunghyukoh , Thank you for your interest in our work and for reporting this!

In practice, this unused variable led the model to use position==0 for the query. This was fixed in https://github.com/abertsch72/unlimiformer/commit/60b4316d524e19b52ebba3954af75d6fa07b84a9
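To illustrate what went wrong: here is a minimal, self-contained sketch of the bug pattern, not the repo's actual implementation (which operates on batched attention tensors via HuggingFace's rotary-embedding helpers). The function names and the single-vector RoPE below are simplified for illustration. Computing `q_embed` but returning the original `q` means the rotary rotation for the query's position is silently discarded, and since RoPE at position 0 is the identity (cos=1, sin=0), the query behaves as if it were at position 0.

```python
import math

def rotate_half(x):
    # Split the vector in half and rotate: (x1, x2) -> (-x2, x1), as in LLaMA's RoPE.
    half = len(x) // 2
    return [-v for v in x[half:]] + list(x[:half])

def apply_rotary_pos_emb(q, pos, base=10000.0):
    # Minimal rotary embedding for a single query vector at position `pos`.
    dim = len(q)
    cos = [math.cos(pos / base ** (2 * (i % (dim // 2)) / dim)) for i in range(dim)]
    sin = [math.sin(pos / base ** (2 * (i % (dim // 2)) / dim)) for i in range(dim)]
    rot = rotate_half(q)
    return [q[i] * cos[i] + rot[i] * sin[i] for i in range(dim)]

def preprocess_query_buggy(q, pos):
    # Pre-fix pattern: q_embed is computed but never used, so the returned
    # query is unrotated, i.e. it behaves as if it were at position 0.
    q_embed = apply_rotary_pos_emb(q, pos)  # unused!
    return q

def preprocess_query_fixed(q, pos):
    # Post-fix pattern: the rotated query is actually returned.
    q_embed = apply_rotary_pos_emb(q, pos)
    return q_embed
```

Note that for `pos == 0` the two versions agree, which is why the bug is easy to miss in short-context tests: only queries at nonzero positions are affected.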

Thanks again, let us know if you have any more questions!

Best, Uri