microbenchmarks/perf_pagesize/bench_pagesize.py

microsoft / vattention

Dynamic Memory Management for Serving LLMs without PagedAttention

MIT License

248 stars 16 forks

Open alvi75 opened 3 months ago

alvi75 commented 3 months ago

u64 do_cuda_uvm_init(int, u64): Assertion `page_size == 64*KB || page_size == 128*KB || page_size == 256*KB' failed.
Aborted (core dumped)
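
For reference, the assertion message lists only three accepted page sizes: 64 KB, 128 KB, and 256 KB. A minimal Python sketch of that same condition can help check a value before launching the benchmark. The helper name `is_supported_page_size` is hypothetical and not part of vattention, and the sketch assumes `KB` means 1024 bytes:

```python
KB = 1024  # assumption: KB is 1024 bytes, which the issue does not state

# The three sizes named in the assertion message.
SUPPORTED_PAGE_SIZES = (64 * KB, 128 * KB, 256 * KB)


def is_supported_page_size(page_size: int) -> bool:
    """Hypothetical helper mirroring the condition in the failed assertion."""
    return page_size in SUPPORTED_PAGE_SIZES


if __name__ == "__main__":
    # 2 MB is used here only as an example of a size outside the list.
    for size in (64 * KB, 2 * 1024 * KB):
        status = "ok" if is_supported_page_size(size) else "would trip the assertion"
        print(f"{size // KB} KB: {status}")
```

This only restates the check visible in the error text. How the page size is derived from the benchmark's arguments is not shown in this thread.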

apanwariisc commented 3 months ago

What value are you passing for model_block_size?

alvi75 commented 3 months ago

Sorry for the late response. I just created a new issue with a detailed explanation.