conway-abacus opened 2 months ago
It is probably because your TRT versions are different in the two docker images; could you check?
Thanks @byshiue, do you mean the docker image used to build the engine?
```
>>> import tensorrt
>>> tensorrt.__version__
'9.3.0.post12.dev1'
```
I was following the guide; should I try to downgrade/rebuild, or upgrade in the server docker?
You could check the TRT version used to run Triton. You can either upgrade the TRT version of the Triton docker image to 9.3, or downgrade the TRT version used to build the engine to 9.2.
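For reference, the mismatch can be confirmed programmatically by comparing the two version strings reported in this thread (a minimal stdlib-only sketch; the version strings are the ones quoted above):

```python
# Compare the TRT version used to build the engine vs. the version
# Triton expects at runtime. Versions are taken from this thread.
def parse_version(v):
    # Keep only the leading numeric dotted components,
    # e.g. "9.3.0.post12.dev1" -> (9, 3, 0)
    parts = []
    for p in v.split("."):
        if p.isdigit():
            parts.append(int(p))
        else:
            break
    return tuple(parts)

build_trt = "9.3.0.post12.dev1"  # TRT in the engine-building image
runtime_trt = "9.2.0.5"          # version Triton reports expecting

if parse_version(build_trt)[:2] != parse_version(runtime_trt)[:2]:
    print("TRT major.minor mismatch: the engine will not load in this runtime")
```

TensorRT engines are generally not portable across TRT versions, which is why the major.minor comparison is the relevant check here.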
> You can upgrade the TRT version of triton docker image to 9.3

Can you help with how to upgrade the TRT version of the Triton docker image to 9.3, e.g. by building from source?
Hi @conway-abacus, could you try doing everything (both engine building and starting Triton) in this image: nvcr.io/nvidia/tritonserver:24.04-trtllm-python-py3? This should align the build and runtime versions to TRT-LLM v0.9.0.
System Info
nvcr.io/nvidia/tritonserver:24.02-trtllm-python-py3
Who can help?
@kaiyux @byshiue
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
I'm not able to successfully launch the Triton server for a quantized Mixtral model according to the README instructions (using tag v0.9.0 for both tensorrtllm_backend and TensorRT-LLM, with nvcr.io/nvidia/tritonserver:24.02-trtllm-python-py3 as advised here). I was able to build the engine and run the run.py script from the TensorRT-LLM repo and got reasonable results, but I'm including the steps for completeness. Then, when trying to launch the Triton server, I ran the following and received the error.
Expected behavior
The launch_triton_server.py script should launch the server successfully.

Actual behavior
The launch_triton_server.py script shows the following error.

Additional notes
Although the error message says expecting library version 9.2.0.5 got 9.3.0.1, here are the contents of /usr/local/tensorrt/include/NvInferVersion.h.

Also, according to this:

> The dependent TensorRT version is updated to 9.3
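The installed TRT version can also be read directly from that header. Below is a small sketch that extracts the NV_TENSORRT_MAJOR/MINOR/PATCH/BUILD macros (those macro names are the ones TensorRT's NvInferVersion.h uses; the embedded sample text is illustrative, matching the 9.3.0.1 version reported by the error):

```python
import re

# Illustrative excerpt of NvInferVersion.h; on a real system, read the
# actual file from /usr/local/tensorrt/include/NvInferVersion.h instead.
header_text = """
#define NV_TENSORRT_MAJOR 9
#define NV_TENSORRT_MINOR 3
#define NV_TENSORRT_PATCH 0
#define NV_TENSORRT_BUILD 1
"""

def trt_version_from_header(text):
    # Collect each "#define NV_TENSORRT_<FIELD> <number>" into a dict
    macros = dict(re.findall(r"#define\s+NV_TENSORRT_(\w+)\s+(\d+)", text))
    return ".".join(macros[k] for k in ("MAJOR", "MINOR", "PATCH", "BUILD"))

print(trt_version_from_header(header_text))  # -> 9.3.0.1
```

Comparing this value against the version Python reports (import tensorrt; tensorrt.__version__) in each container is a quick way to spot which image carries the stale library.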