mit-han-lab/duo-attention
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
MIT License
252 stars · 10 forks
chore: update llama.py #1 (Closed)
eltociear closed this 5 days ago
eltociear commented 1 week ago
continous -> continuous