vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Feature]: Compatibility issues #3555

Open BlackHandsomeLee opened 5 months ago

BlackHandsomeLee commented 5 months ago

🚀 The feature, motivation and pitch

Can the vLLM acceleration framework be made compatible with TensorRT-LLM? Here is the documentation for TensorRT-LLM: https://github.com/NVIDIA/TensorRT-LLM

Alternatives

No response

Additional context

No response

simon-mo commented 5 months ago

Can you elaborate about which aspect of the compatibility you are interested in? The API/distribution/kernels?

BlackHandsomeLee commented 5 months ago

> Can you elaborate about which aspect of the compatibility you are interested in? The API/distribution/kernels?

Can vLLM offline inference be made compatible with running a locally built TensorRT-LLM engine?

simon-mo commented 5 months ago

Do you mean producing the same outputs? vLLM offline inference can run models end to end with performance on par with TensorRT-LLM.
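
For reference, the offline inference path being discussed is vLLM's `LLM.generate` API, which loads a Hugging Face model directly rather than consuming a prebuilt TensorRT-LLM engine. A minimal sketch (the model ID `facebook/opt-125m` and the sampling settings here are just illustrative choices, not from this thread):

```python
from vllm import LLM, SamplingParams

# Prompts to run in a single offline batch.
prompts = ["Hello, my name is", "The capital of France is"]

# Sampling configuration for generation.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# Load the model from the Hugging Face Hub (or a local path) into vLLM's engine.
llm = LLM(model="facebook/opt-125m")

# Run end-to-end generation and print the completions.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.prompt, "->", output.outputs[0].text)
```

Because the weights are loaded and the kernels are run by vLLM itself, this path does not load a serialized TensorRT-LLM engine; the comparison above is about output quality and throughput parity, not interoperability of engine artifacts.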