triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Version inconsistency: TensorRT and Triton images #7613

Closed chenchunhui97 closed 2 months ago

chenchunhui97 commented 2 months ago

Description I compiled a TensorRT engine for a CV model from an ONNX file, and I want to use the engine in Triton Server, but I encountered an error. It appears to be caused by a version inconsistency between the TensorRT and tritonserver images.

E0912 02:00:56.166591 1 logging.cc:40] 1: [stdArchiveReader.cpp::StdArchiveReaderInitCommon::46] Error Code 1: Serialization (Serialization assertion stdVersionRead == serializationVersion failed.Version tag does not match. Note: Current Version: 236, Serialized Engine Version: 239)
I0912 02:00:56.167397 1 tensorrt.cc:274] TRITONBACKEND_ModelFinalize: delete model state
E0912 02:00:56.167417 1 model_lifecycle.cc:630] failed to load 'det' version 1: Internal: unable to load plan file to auto complete config: /models/det/1/model.plan
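The serialization assertion above means the plan file was built with a newer TensorRT than the runtime bundled in the container. As a quick check, here is a minimal sketch (assuming the TensorRT Python bindings are available in the environment being tested; the plan path is taken from the error log above) that prints the runtime version and attempts to deserialize the plan:

```python
# Sketch: compare the TensorRT runtime version against the engine's build version.
# Run inside the environment in question (e.g. the tritonserver container).
import tensorrt as trt

print("TensorRT runtime version:", trt.__version__)

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

with open("/models/det/1/model.plan", "rb") as f:  # path from the error log above
    plan = f.read()

# Deserialization returns None (and logs a serialization assertion like the one
# above) when the plan was built with an incompatible TensorRT version.
engine = runtime.deserialize_cuda_engine(plan)
print("deserialized OK" if engine is not None else "version mismatch: rebuild the engine")
```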

Triton Information

The engine was compiled with TensorRT 10.4.0 (installed via pip install tensorrt). Official container: tritonserver:24.01-py3

Are you using the Triton container or did you build it yourself? I use the official container.
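One way to avoid the mismatch is to rebuild the plan with the same TensorRT release the serving container ships; the follow-up comment below reports that the 8.6.1 line works with tritonserver:24.01-py3. A minimal sketch of rebuilding the engine from the ONNX file with the TensorRT 8.x Python API; the file paths and workspace size are placeholders, not taken from the issue:

```python
# Sketch: rebuild model.plan from the original ONNX file using a TensorRT
# version matching the serving container (e.g. pip install tensorrt==8.6.1,
# which the follow-up comment reports works with tritonserver:24.01-py3).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("det.onnx", "rb") as f:  # placeholder path to the original ONNX model
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse ONNX model")

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB, placeholder

serialized = builder.build_serialized_network(network, config)
if serialized is None:
    raise RuntimeError("engine build failed")

with open("model.plan", "wb") as f:  # copy into /models/det/1/ in the model repository
    f.write(serialized)
```

Building the plan inside a container from the same release family as the serving image (rather than a locally pip-installed TensorRT) is another way to keep the two versions aligned.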

To Reproduce Steps to reproduce the behavior.

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).

Expected behavior I want to know which TensorRT version I should choose. Is there a version table that indicates the compatibility between tritonserver containers and TensorRT releases?

chenchunhui97 commented 2 months ago

TensorRT v8.6.1 works. I found a version matrix.
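For future builds, a small guard in the export script can catch this kind of mismatch before the plan ever reaches the server. A sketch, assuming the TensorRT line expected by the target container (8.6.1 per the comment above) is known ahead of time:

```python
# Sketch: fail fast if the local TensorRT does not match the serving container's line.
import tensorrt as trt

EXPECTED = "8.6.1"  # TensorRT line reported above to work with tritonserver:24.01-py3
if not trt.__version__.startswith(EXPECTED):
    raise RuntimeError(
        f"Local TensorRT is {trt.__version__}, but the serving container expects {EXPECTED}; "
        "the resulting model.plan would fail to deserialize."
    )
```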