vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

Virtual Office Hours: August 8 and August 21 #7657

Open mgoin opened 3 weeks ago

mgoin commented 3 weeks ago

vLLM Virtual Open Office Hours

We enjoyed seeing everyone at the previous office hours and got great feedback. These office hours are a bi-weekly virtual live event where you can learn more about the vLLM project, find out how to contribute, and get help with your issues - with special topics and guests along the way.

Sign up here: https://neuralmagic.com/community-office-hours/. You can watch previous sessions on the YouTube playlist.

Dates:

- August 8
- August 21

If there are any themes or topics you would like to see addressed, please comment below. We look forward to seeing you there!

Previous office hour issue: https://github.com/vllm-project/vllm/issues/5937

lev-channel commented 3 weeks ago

Hi, I'm highly interested in running vLLM on Google Cloud TPU VMs, especially TPU v4 pod slices such as TPU v4-32. Is there any plan to upload this office hour to YouTube? :)

sparsh35 commented 2 weeks ago

Sorry, just saw it now. I had a question about performance benchmarks on TPUs: what are the best practices, and does quantization lead to more throughput?