thunlp / InfLLM

The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
MIT License
269 stars, 21 forks

Refactor, add description and qwen support #10

Closed: guyan364 closed this 5 months ago

guyan364 commented 5 months ago

  1. Refactor the Context Manager and Triton Attention to accelerate inference
  2. Add a description of the Config options
  3. Add support for Qwen series models

close #3 #9