microsoft / vattention

Dynamic Memory Management for Serving LLMs without PagedAttention
MIT License
185 stars, 11 forks

Update scripts and fix avoidable exceptions #10

Closed: apanwariisc closed this 1 month ago

apanwariisc commented 1 month ago
  1. Round up the size of the virtual memory allocation to prevent avoidable exceptions.
  2. Minor updates to the benchmarking scripts.
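
For reviewers unfamiliar with item 1, the rounding can be sketched roughly as below. This is only an illustration of the general technique, not the code in this PR: the helper name `round_up_to_granularity` and the 2 MiB granularity are assumptions made for the example. In practice the granularity would be queried from the driver (for instance via CUDA's `cuMemGetAllocationGranularity`) rather than hard-coded.

```python
def round_up_to_granularity(size: int, granularity: int) -> int:
    """Return the smallest multiple of `granularity` that is >= `size`.

    Hypothetical helper for illustration; not taken from the vattention code.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    # Integer ceiling division, then scale back up to a byte count.
    return ((size + granularity - 1) // granularity) * granularity


# Assumed granularity for the example only; query the real value at runtime.
ASSUMED_GRANULARITY = 2 * 1024 * 1024  # 2 MiB

# A request that is one byte past a 5 MiB boundary is not a multiple of the
# granularity, so it is rounded up before being used as a reservation size.
requested_bytes = 5 * 1024 * 1024 + 1
reserved_bytes = round_up_to_granularity(requested_bytes, ASSUMED_GRANULARITY)
print(f"requested={requested_bytes} reserved={reserved_bytes}")
```

Rounding up over-reserves by less than one granule of virtual address space per allocation, which is usually a negligible cost compared with failing on a misaligned size.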