microsoft/vattention
Dynamic Memory Management for Serving LLMs without PagedAttention
MIT License
219 stars, 14 forks
Add microbenchmark to profile kernel latency with different page sizes #3
Closed
apanwariisc closed 3 months ago