vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

Support for TPU hardware #2835

Open alboimDor opened 4 months ago

alboimDor commented 4 months ago

I've noticed the roadmap for 2024 Q1 includes adding support for Google's TPU hardware. Is TPU support still being considered?

Thank you

DavidPeleg6 commented 4 months ago

+1

chujiezheng commented 3 months ago

+1

qeternity commented 3 months ago

I would expect so. It's not clear why there hasn't been more demand for it. This chart from Google last year benchmarking TPU v4 is pretty compelling.

[image: Google TPU v4 benchmark chart]

rick-c-goog commented 6 days ago

+1