microsoft / vattention

Dynamic Memory Management for Serving LLMs without PagedAttention
MIT License

Compatibility Issues with vattention on A100 and A30 GPUs with CUDA 12.5 and 12.3 #16

Open alvi75 opened 2 months ago

alvi75 commented 2 months ago

I'm running into several issues while compiling and running the vattention library on both NVIDIA A100 and A30 GPUs, with CUDA 12.5 and 12.3. The problems appear to stem from incompatibilities between the vattention code and these GPU architectures.
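For reference, the environment details that usually matter for this kind of report (GPU model, compute capability, driver version, CUDA toolkit version) can be gathered with a short script. This is only a minimal sketch: it assumes `nvidia-smi` and `nvcc` are on `PATH` and that the installed driver supports the `compute_cap` query field. The helper names `run_tool` and `collect_gpu_report` are made up for illustration and are not part of vattention.

```python
import shutil
import subprocess


def run_tool(cmd):
    """Return the output of cmd, or a short note if the tool is missing or fails."""
    if shutil.which(cmd[0]) is None:
        return f"{cmd[0]}: not found on PATH"
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"{cmd[0]}: failed ({exc})"
    # Fall back to stderr so that error messages from the tool are not lost.
    output = result.stdout.strip() or result.stderr.strip()
    return output or f"{cmd[0]}: no output"


def collect_gpu_report():
    """Gather GPU name, compute capability, driver version and nvcc version."""
    return {
        # Assumption: the driver is recent enough to know the compute_cap field.
        "gpus": run_tool([
            "nvidia-smi",
            "--query-gpu=name,compute_cap,driver_version",
            "--format=csv,noheader",
        ]),
        "nvcc": run_tool(["nvcc", "--version"]),
    }


if __name__ == "__main__":
    for key, value in collect_gpu_report().items():
        print(f"[{key}]\n{value}\n")
```

Running it on each machine and pasting the output gives one block per tool, which makes it easier to compare the A100 and A30 setups side by side.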

System Configuration:

Additional Notes:

Request:

Could you please provide guidance on how to resolve these compatibility issues? If the library is currently incompatible with A100 or A30 GPUs, are any updates planned to address this?

Also, I was able to run the benchmark scripts.