vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

Support for TPU hardware #2835

Open alboimDor opened 4 months ago

alboimDor commented 4 months ago

I've noticed the roadmap for 2024 Q1 includes adding support for Google's TPU hardware. Is TPU support still being considered?

Thank you

DavidPeleg6 commented 4 months ago

+1

chujiezheng commented 3 months ago

+1

qeternity commented 3 months ago

I would expect so. It's not clear why there hasn't been more demand for it. This chart from Google last year benchmarking TPU v4 is pretty compelling.

[image: Google TPU v4 benchmark chart]

rick-c-goog commented 6 days ago

+1