sdan / selfextend

an implementation of Self-Extend, to expand the context window via grouped attention
https://arxiv.org/pdf/2401.01325.pdf
Apache License 2.0
115 stars, 2 forks

Long input sequences cause OOM #4

Open seanxuu opened 6 months ago

seanxuu commented 6 months ago

Great job! However, I have run into an issue: as the input length increases, GPU memory consumption grows rapidly and quickly leads to an out-of-memory (OOM) error. Could you please let me know whether this is a bug?
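A rough back-of-the-envelope estimate may help tell expected growth apart from a bug. The sketch below assumes the attention scores are materialized as a full `seq_len x seq_len` matrix per head, which has not been verified against this repository. The function names and the default model dimensions (32 layers, 32 heads, head size 128, fp16) are hypothetical placeholders, not values taken from the code here.

```python
# Rough memory estimate, assuming full attention score matrices are materialized.
# All defaults below are illustrative assumptions, not values from this repository.

def attention_scores_bytes(seq_len, num_heads=32, bytes_per_elem=2, batch_size=1):
    """Bytes for one layer's (seq_len x seq_len) attention scores across all heads."""
    return batch_size * num_heads * seq_len * seq_len * bytes_per_elem


def kv_cache_bytes(seq_len, num_layers=32, num_heads=32, head_dim=128,
                   bytes_per_elem=2, batch_size=1):
    """Bytes for keys and values cached across all layers."""
    # Factor of 2 accounts for storing both keys and values.
    return 2 * batch_size * num_layers * num_heads * head_dim * seq_len * bytes_per_elem


if __name__ == "__main__":
    mib = 1024 ** 2
    for n in (2048, 4096, 8192, 16384):
        scores = attention_scores_bytes(n) / mib
        cache = kv_cache_bytes(n) / mib
        print(f"seq_len={n:6d}  scores/layer={scores:10.1f} MiB  kv_cache={cache:10.1f} MiB")
```

Under these assumptions the score matrix grows quadratically with input length while the KV cache grows linearly, so doubling the input roughly quadruples the score memory. If measured usage grows much faster than such an estimate, that would point more toward a bug than toward expected behavior.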