mit-han-lab / duo-attention

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Typos in readme #4

Closed · gaotianyu1350 closed this issue 6 days ago

gaotianyu1350 commented 6 days ago

Great work! There is a typo in the README:

enable_duo_attention_eval(
    model,
    attn_heads,
    num_recent_tokens=64,
    num_sink_tokens=256,
)

The argument names look swapped: num_recent_tokens should be sink_size, and num_sink_tokens should be recent_size.
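
For reference, a minimal sketch of the corrected call, assuming the sink_size/recent_size parameter names suggested above (model and attn_heads set up as in the README):

# sink_size: number of initial attention-sink tokens kept by streaming heads (assumed semantics)
# recent_size: number of most recent tokens kept by streaming heads (assumed semantics)
enable_duo_attention_eval(
    model,
    attn_heads,
    sink_size=64,
    recent_size=256,
)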

Guangxuan-Xiao commented 6 days ago

Fixed. Thank you!