microsoft/vattention
Dynamic Memory Management for Serving LLMs without PagedAttention
MIT License
248 stars, 16 forks
Add microbenchmark to profile kernel latency with different page sizes #3
Closed
apanwariisc closed 4 months ago