triton-inference-server / tensorrtllm_backend

The Triton TensorRT-LLM Backend
Apache License 2.0

Can you provide an example of a visual language model or multimodal model launch by triton server? #463

Open lzcchl opened 1 month ago

lzcchl commented 1 month ago

There is an example at https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/qwenvl , but I have no idea how to use this model in Triton server. Can you provide an example of a visual language model or multimodal model?

byshiue commented 3 weeks ago

The multimodal example for the backend is still in progress, because the backend for multimodal models is very different from the one for pure decoder models (like GPT).

XiaoYu2022 commented 2 weeks ago

Have you found a demonstration program or solution? I want to deploy the quantized Baichuan large model using Triton, based on the example provided by TensorRT-LLM, but I'm still missing some clues.

byshiue commented 2 weeks ago

> Have you found a demonstration program or solution? I want to deploy the quantized Baichuan large model using Triton, based on the example provided by TensorRT-LLM, but I'm still missing some clues.

If you only want to use the Baichuan model (decoder only), it should work, and you can refer to documents like baichuan.md. If you want to use a multimodal model, that is still in progress.

lzcchl commented 2 weeks ago

As byshiue said, the multimodal example for the backend is still in progress. If you want to use a multimodal model in the Triton Inference Server framework right now, please use python-backend as a transitional solution.

XiaoYu2022 commented 2 weeks ago

Thank you very much. I have now completed the quantization of the Baichuan2-13B model. However, I am still unclear about the part of using Triton to deploy the model and obtain the inference interface, specifically how to access the inference interface provided by Triton externally. I noticed that the official Triton image has been updated, but I am still unsure about how to use the image to expose the interface for external access. Are there any relevant materials available? Best wishes.

byshiue commented 2 weeks ago

Could you share what concrete issue you encounter when you try following the documents here and here?
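
On the question of reaching the inference interface from outside, a minimal client-side sketch follows. It is not taken from this thread and rests on several assumptions: that the server was started with its HTTP port published (8000 is Triton's default HTTP port), that the model repository follows the tensorrtllm_backend example layout so the top-level model is named `ensemble`, and that this model accepts `text_input` and `max_tokens` and returns `text_output` through Triton's generate endpoint. The host name and the helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
import urllib.request


def build_generate_request(prompt, host="localhost", port=8000,
                           model="ensemble", max_tokens=64):
    """Build a POST request for Triton's HTTP generate endpoint.

    Assumptions: default HTTP port 8000, a model named "ensemble", and
    input fields named as in the tensorrtllm_backend example repository.
    """
    url = f"http://{host}:{port}/v2/models/{model}/generate"
    payload = {
        "text_input": prompt,
        "max_tokens": max_tokens,
        "bad_words": "",
        "stop_words": "",
    }
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, assuming a server is reachable at the given (hypothetical) host:
#   result = generate("What is machine learning?", host="my-triton-host")
#   print(result["text_output"])
```

If the model or field names in your repository differ, only the `model` argument and the `payload` keys need to change; the URL pattern `/v2/models/<model>/generate` stays the same.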