microsoft/vattention
Dynamic Memory Management for Serving LLMs without PagedAttention
MIT License
182 stars, 10 forks
issues
3090 Can I run this program? (#17, LiDaTaoTao, opened 2 weeks ago, 1 comment)
Compatibility Issues with vattention on A100 and A30 GPUs with CUDA 12.5 and 12.3 (#16, alvi75, opened 3 weeks ago, 0 comments)
microbenchmarks/perf_pagesize/bench_pagesize.py (#15, alvi75, opened 3 weeks ago, 2 comments)
CPU memory leaking? (#14, JasonHe-WQ, opened 1 month ago, 0 comments)
why init_kvcache need vattention.reserve_physical_pages(GPU_MEM_RESERVE) (#13, dingzhiqiang, opened 1 month ago, 1 comment)
sarathi-lean/sarathi/cache_ops.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c1021throwNullDataPtrErrorEv (#12, zhu1365377615, opened 1 month ago, 7 comments)
Add OpenAI compatible API (#11, apanwariisc, closed 1 month ago, 0 comments)
Update scripts and fix avoidable exceptions (#10, apanwariisc, closed 1 month ago, 0 comments)
Is this the real repository? (#9, CSEEduanyu, opened 1 month ago, 1 comment)
Update README files (#8, apanwariisc, closed 1 month ago, 0 comments)
Add support for small page sizes in vattention (#7, apanwariisc, closed 1 month ago, 0 comments)
Update README (#6, apanwariisc, closed 1 month ago, 0 comments)
Add more microbenchmarks (#5, apanwariisc, closed 1 month ago, 0 comments)
Add post-processing scripts (#4, apanwariisc, closed 1 month ago, 0 comments)
Add microbenchmark to profile kernel latency with different page sizes (#3, apanwariisc, closed 1 month ago, 0 comments)
changed dataset to arxive (#2, ramyaprabhu-alt, closed 1 month ago, 1 comment)
Action required: migrate or opt-out of migration to GitHub inside Microsoft (#1, microsoft-github-policy-service[bot], closed 1 month ago, 6 comments)