InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
https://lmdeploy.readthedocs.io/en/latest/
Apache License 2.0
fix falcon attention
#1761
Closed
grimoire closed 2 weeks ago
grimoire commented 3 weeks ago
falcon-7b has 71 attention heads, which leads to an attention kernel error.