HabanaAI/vllm-fork

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

Revert "Contiguous PA" #432

Closed: madamczykhabana closed this 2 days ago

madamczykhabana commented 2 days ago

Reverts HabanaAI/vllm-fork#424
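For context, a revert PR like this one typically carries a single commit that applies the inverse diff of the change being undone. The following is a generic, self-contained sketch of that mechanism in a throwaway repository; the file name and commit messages are illustrative and are not taken from #424.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q

# Base commit, then a commit introducing the change to be undone
# ("feature.txt" stands in for the actual Contiguous PA changes).
git -c user.email=a@b -c user.name=t commit -q --allow-empty -m "base"
echo "contiguous PA" > feature.txt
git add feature.txt
git -c user.email=a@b -c user.name=t commit -q -m "Contiguous PA"

# git revert creates a NEW commit applying the inverse diff,
# leaving the original commit in history.
git -c user.email=a@b -c user.name=t revert --no-edit HEAD

# The reverted file is gone from the working tree.
test ! -f feature.txt && echo "reverted"
```

Note that reverting a merged PR on GitHub follows the same idea, except the revert targets the merge commit (e.g. `git revert -m 1 <merge-commit>`), which is what GitHub's "Revert" button automates.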