Closed k0286 closed 1 month ago
@k0286 Ideally you should be able to build a container image with both the vLLM and Torch backends.
Triton supports numerous backends, which leads to a large number of possible combinations of container images. Additionally, the vLLM dependency is large, so to avoid increasing the image size further, the vLLM container image does not carry any other backends.
However, for your use case, you can start with the Triton container image that includes the PyTorch backend and install the vLLM backend into it. See the instructions here on how to build this image.
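A minimal sketch of that approach, assuming the layout used by the triton-inference-server/vllm_backend repository (the base image tag, vLLM version pin, and paths below are assumptions — follow the linked instructions for the exact values matching your Triton release):

```dockerfile
# Start from a Triton release image that already carries the PyTorch
# backend (tag is an assumption; pick the release you deploy).
FROM nvcr.io/nvidia/tritonserver:23.10-py3

# Install vLLM into the image's Python environment
# (version pin is an assumption; match it to your Triton release).
RUN pip install vllm==0.2.1

# Place the vLLM backend's Python model alongside the other backends
# (assumes the triton-inference-server/vllm_backend repo is checked
# out next to this Dockerfile).
COPY vllm_backend/src /opt/tritonserver/backends/vllm
```

With an image like this, a single model repository can hold both your existing Torch models and a vLLM model, and one `tritonserver` process serves them all.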
Thanks, I'll give it a try!
Is your feature request related to a problem? Please describe. I have some Torch models on TIS, and now I want to add an LLM model.
I noticed that TIS supports vLLM, but there are no Triton images on NGC that support both the vLLM and Torch backends.
Describe the solution you'd like Provide a Triton image that supports both the vLLM and Torch backends.
Describe alternatives you've considered An explanation of why the vLLM and Torch backends cannot be supported at the same time.