Closed jasonngap1 closed 3 months ago
As the error message states:
triton-models-1 | [TensorRT-LLM][ERROR] 6: The engine plan file is not compatible with this version of TensorRT, expecting library version 10.0.1.6 got 10.1.0.27
Please rebuild the engine with commit 9691e12bce7ae1c126c435a049eb516eb119486c.
Thank you for pointing this out. The commit mentioned seems to be the latest commit, which uses TensorRT version 10.1.0, but the Triton server expects the older version 10.0.1 even though I built the server with the latest TensorRT-LLM. Is it possible to trace which commit I should build the engine from to get TensorRT version 10.0.1, please?
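The mismatch discussed here can be read straight off the error line. The following is a minimal sketch, not part of TensorRT-LLM: `parse_mismatch` and `VersionMismatch` are hypothetical helper names, and the code assumes that "expecting" is the version of the TensorRT library loading the engine and "got" is the version recorded in the engine plan file, which matches the description above.

```python
import re
from typing import NamedTuple, Optional


class VersionMismatch(NamedTuple):
    runtime: str  # assumed: TensorRT library version that tried to load the engine
    engine: str   # assumed: TensorRT version recorded in the engine plan file


# Matches the wording of the error quoted in this thread.
_PATTERN = re.compile(
    r"expecting library version (?P<runtime>\d+(?:\.\d+)+)\s+"
    r"got\s+(?P<engine>\d+(?:\.\d+)+)"
)


def parse_mismatch(log_line: str) -> Optional[VersionMismatch]:
    """Return both versions from the error line, or None if the line does not match."""
    match = _PATTERN.search(log_line)
    if match is None:
        return None
    return VersionMismatch(match.group("runtime"), match.group("engine"))


line = (
    "triton-models-1 | [TensorRT-LLM][ERROR] 6: The engine plan file is not "
    "compatible with this version of TensorRT, expecting library version "
    "10.0.1.6 got 10.1.0.27"
)
mismatch = parse_mismatch(line)
if mismatch is not None:
    print(f"runtime TensorRT: {mismatch.runtime}, engine built with: {mismatch.engine}")
```

Under that reading, the engine was built with 10.1.0.27 while the serving side loads 10.0.1.6, so one of the two has to be rebuilt against the other's TensorRT version.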
As the description of https://github.com/NVIDIA/TensorRT-LLM/pull/1835 mentions, the latest commit upgrades TRT to 10.1. Please try 2a115dae84f13daaa54727534daa837c534eceb4 instead, which is the parent commit of HEAD.
Thank you @nv-guomingz, this resolved my issue. Closing this issue on my end.
System Info
Who can help?
@kaiyux @byshiue
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Expected behavior
I would expect the TensorRT engine to work with the Triton Inference Server.
Actual behavior
Additional notes
Both the Triton server and the TensorRT engine were built from the same TensorRT-LLM version, at commit 2a115dae84f13daaa54727534daa837c534eceb4.
Model used: Llama3-ChatQA-1.5-8B
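Since the error concerns the TensorRT library version rather than the TensorRT-LLM commit, it may help to print the installed TensorRT version in both the engine-build environment and the Triton container and compare the two. This is a minimal sketch using only the Python standard library; it assumes the Python distribution is installed under the name `tensorrt`, and `installed_version` and `same_tensorrt` are hypothetical helper names.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "tensorrt") -> Optional[str]:
    """Return the installed version of a Python distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def same_tensorrt(build_version: Optional[str], runtime_version: Optional[str]) -> bool:
    """True only when both versions are known and identical."""
    return build_version is not None and build_version == runtime_version


# Run this in each environment and compare the printed values by hand,
# or pass both values to same_tensorrt().
print("tensorrt:", installed_version("tensorrt"))
```

If the two environments print different versions, the engine and the server are not using the same TensorRT, even when both were built from the same TensorRT-LLM commit.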