Hi @lyc, the trtllm container ships exclusively with the TensorRT-LLM backend. If you would like to use PyTorch and ONNX Runtime, please use our base container: nvcr.io/nvidia/tritonserver:23.11-py3
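For example, the base container can be pulled and started directly. A minimal sketch, assuming your model repository lives at /path/to/model_repository (a placeholder path):

```
docker pull nvcr.io/nvidia/tritonserver:23.11-py3
docker run --rm --gpus all \
    -v /path/to/model_repository:/models \
    nvcr.io/nvidia/tritonserver:23.11-py3 \
    tritonserver --model-repository=/models
```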
Alternatively, you can build a container that includes TensorRT-LLM alongside the other backends you need by passing them all to build.py:

```
--backend=tensorrtllm --backend=python --backend=onnxruntime --backend=pytorch
```
Please refer to the documentation here: https://github.com/triton-inference-server/server/blob/main/docs/customization_guide/build.md#building-with-docker
I will close this issue for now; feel free to reach out with any questions.
> --backend=onnxruntime --backend=pytorch

I ran:

```
tritonserver --model-repository=/models/liuyuanchao/tensorrtllm_backend/triton_model_repo --backend-config=onnxruntime --backend-config=pytorch
```

and it throws an error.
Please use the --backend flag, not --backend-config.
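For context: at run time, --backend-config expects the form <backend>,<setting>=<value> and only configures a backend that is already installed in the image; which backends are installed is decided at build time via build.py's --backend flag. A sketch of the two, with an illustrative setting:

```
# Build time (build.py): include the backends in the image
./build.py --enable-gpu --backend=onnxruntime --backend=pytorch

# Run time (tritonserver): configure an already-installed backend
tritonserver --model-repository=/models \
    --backend-config=onnxruntime,default-max-batch-size=4
```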
Please also use the build steps provided in the docs I linked. To build the container, you need to call the build.py script outside of any container. For example:
```
./build.py -v --no-container-interactive --enable-logging --enable-stats --enable-tracing \
    --enable-metrics --enable-gpu-metrics --enable-cpu-metrics \
    --filesystem=gcs --filesystem=s3 --filesystem=azure_storage \
    --endpoint=http --endpoint=grpc --endpoint=sagemaker --endpoint=vertex-ai \
    --backend=ensemble --enable-gpu \
    --backend=tensorrtllm \
    --backend=python --backend=onnxruntime
```
Note that this is different from running tritonserver. The steps above will build you a custom container with all the specified backends (selected through the --backend flag).
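Once the build finishes, the resulting image can be run like any other Triton container. A sketch, assuming build.py's default image tag (tritonserver) and the model repository path from earlier in this thread:

```
# image tag 'tritonserver' (the build.py default), then the server binary inside it
docker run --rm --gpus all \
    -v /models/liuyuanchao/tensorrtllm_backend/triton_model_repo:/models \
    tritonserver \
    tritonserver --model-repository=/models
```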
I pulled the trtllm image (nvcr.io/nvidia/tritonserver:23.11-trtllm-python-py3), but it does not contain the onnxruntime and pytorch backends:

```
UNAVAILABLE: Invalid argument: unable to find 'libtriton_pytorch.so'
UNAVAILABLE: Invalid argument: unable to find 'libtriton_onnxruntime.so'
```
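These errors come from Triton's backend lookup: each backend must be present as /opt/tritonserver/backends/<name>/libtriton_<name>.so inside the container. A quick sketch of how to check what an image actually ships with (the expected output is an assumption based on the container's name):

```
docker run --rm nvcr.io/nvidia/tritonserver:23.11-trtllm-python-py3 \
    ls /opt/tritonserver/backends
# should list tensorrtllm and python, but not onnxruntime or pytorch
```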