tomaarsen / attention_sinks

Extend existing LLMs well beyond their original training length with constant memory usage, without retraining
https://huggingface.co/blog/tomaarsen/attention-sinks
Apache License 2.0

chatglm3 support? #40

Open ScottishFold007 opened 7 months ago

ScottishFold007 commented 7 months ago

Your project is really great. Could you add support for chatglm3? https://huggingface.co/THUDM/chatglm3-6b/blob/main/modeling_chatglm.py
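
For context, a minimal sketch of what the requested support might look like, assuming chatglm3 could one day be loaded through the library's `AutoModelForCausalLM` drop-in. The `attention_sink_size` / `attention_sink_window_size` arguments follow the patterns shown in the attention_sinks README for already-supported architectures; chatglm3 support itself is exactly what this issue asks for and is not implemented, so this is purely hypothetical usage:

```python
# Hypothetical: this is the requested feature, NOT current attention_sinks behavior.
from attention_sinks import AutoModelForCausalLM
from transformers import AutoTokenizer

model_id = "THUDM/chatglm3-6b"

# chatglm3 ships custom modeling code on the Hub, so trust_remote_code is required.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",
    attention_sink_size=4,             # initial "sink" tokens kept in the KV cache
    attention_sink_window_size=1020,   # sliding window of most recent tokens
)
```

Supporting this would presumably require a chatglm-specific attention/KV-cache patch like the per-architecture ones the library already maintains, since chatglm3's `modeling_chatglm.py` does not follow the stock `transformers` attention layout.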