mit-han-lab / duo-attention

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
MIT License

chore: update llama.py #1

Closed — eltociear closed this pull request 5 days ago

eltociear commented 1 week ago

`continous` -> `continuous`