kelkarn opened this issue 2 weeks ago
I also noticed the Execution Provider requirements listed on the ONNX Runtime webpage. Based on these, it looks like the TensorRTExecutionProvider and CUDAExecutionProvider with ONNX Runtime 1.15.1 require CUDA 11.8 and TensorRT 8.6, whereas the tensorrt:24.08-py3 Docker image I am using to convert the model to a TRT plan comes with CUDA 12.6 and TensorRT 10.3. Because of this, I am also unable to load the ONNX model in a Python InferenceSession with these execution providers inside the Docker image.

Does this mean that converting the ONNX model to a TRT plan with the latest TRT version 10.3, when the custom op was built against ONNX Runtime 1.15.1, is just not going to be possible? Or is there a way to achieve this?
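The mismatch described above can be summarized in a minimal sketch. The version numbers are taken from this thread (ONNX Runtime 1.15.1 requiring CUDA 11.8 and TensorRT 8.6; the tensorrt:24.08-py3 container shipping CUDA 12.6 and TensorRT 10.3); the major-version check is an illustrative assumption about why the prebuilt EPs fail to load:

```python
# Versions quoted in this thread.
REQUIRED = {"cuda": (11, 8), "tensorrt": (8, 6)}    # ONNX Runtime 1.15.1 EP requirements
CONTAINER = {"cuda": (12, 6), "tensorrt": (10, 3)}  # nvcr.io/nvidia/tensorrt:24.08-py3

def same_major(required, available):
    # Prebuilt EPs link against a specific major version; a newer major
    # (CUDA 12 vs 11, TensorRT 10 vs 8) is not a drop-in replacement.
    return required[0] == available[0]

mismatches = [name for name in REQUIRED
              if not same_major(REQUIRED[name], CONTAINER[name])]
print("incompatible components:", mismatches)  # both majors differ here
```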
No, it is due to grid_sampler.

Try:

```
export LD_LIBRARY_PATH=/opt/tritonserver/backends/onnxruntime/libmmdeploy_onnxruntime_ops.so:$LD_LIBRARY_PATH
```

then rerun.
@lix19937 - that did not work. I see the same error:
```
[09/26/2024-17:35:33] [V] [TRT] Static check for parsing node: grid_sampler_8539 [grid_sampler]
[09/26/2024-17:35:33] [I] [TRT] No checker registered for op: grid_sampler. Attempting to check as plugin.
[09/26/2024-17:35:33] [V] [TRT] Local registry did not find grid_sampler creator. Will try parent registry if enabled.
[09/26/2024-17:35:33] [E] [TRT] IPluginRegistry::getCreator: Error Code 4: API Usage Error (Cannot find plugin: grid_sampler, version: 1, namespace:.)
```
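A side note on the log above: the error says trtexec found no TensorRT plugin creator for grid_sampler, and `LD_LIBRARY_PATH` entries are normally directories, not `.so` files, so the export as written would not be searched anyway. A hedged sketch of both points (the paths and the plugin-loading flag are illustrative; `libmmdeploy_onnxruntime_ops.so` contains ONNX Runtime custom ops, and trtexec would need a library that registers TensorRT plugins instead — check `trtexec --help` in your container for the exact flag name):

```shell
# LD_LIBRARY_PATH entries should be directories, not .so files:
export LD_LIBRARY_PATH=/opt/tritonserver/backends/onnxruntime:$LD_LIBRARY_PATH

# trtexec only resolves TensorRT plugin creators, so the library passed here
# must register TRT plugins; the ONNX Runtime custom-op library does not.
# Flag and path are assumptions for illustration -- verify against trtexec --help.
trtexec --onnx=model.onnx --staticPlugins=/path/to/libmmdeploy_tensorrt_ops.so
```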
As another workaround, you can use torch.nn.functional.grid_sample to replace the mm version; grid sample is now a built-in layer in TRT 8.6, so you do not need to load a plugin. @kelkarn

And please make sure you are using the latest opset version 17 to export the ONNX model.
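For readers unfamiliar with the op being swapped out: grid sample interpolates an image at normalized coordinates in [-1, 1]. A minimal pure-Python sketch of the 2-D bilinear case with the align_corners=True convention (an illustrative simplification; torch.nn.functional.grid_sample and TRT's built-in layer support batches, channels, padding modes, and other options):

```python
import math

def grid_sample_2d(img, grid):
    """Bilinearly sample a 2-D image at normalized (x, y) coordinates in [-1, 1].
    align_corners=True convention: -1 maps to pixel 0, +1 to the last pixel."""
    H, W = len(img), len(img[0])
    out = []
    for x, y in grid:
        ix = (x + 1) / 2 * (W - 1)   # unnormalize to pixel coordinates
        iy = (y + 1) / 2 * (H - 1)
        x0, y0 = math.floor(ix), math.floor(iy)
        x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
        wx, wy = ix - x0, iy - y0    # bilinear weights
        top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
        bot = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
        out.append(top * (1 - wy) + bot * wy)
    return out

# Corners map to corner pixels; (0, 0) is the average of all four.
print(grid_sample_2d([[0.0, 1.0], [2.0, 3.0]], [(-1, -1), (0, 0), (1, 1)]))
# -> [0.0, 1.5, 3.0]
```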
I see the following error when I run my trtexec command from within the container:

This error is followed by a bunch of errors on the Unsqueeze node like so:

The model here is a DINO model converted to ONNX using MMDeploy, together with a custom op. The custom op symbol in libmmdeploy_onnxruntime_ops.so uses libonnxruntime.so.1.15.1, which I have also copied into the Docker container and added to my LD_LIBRARY_PATH. I am using the nvcr.io/nvidia/tensorrt:24.08-py3 Docker image and the trtexec binary built with TensorRT 10.3.

I found this other, similar issue: https://github.com/onnx/onnx-tensorrt/issues/800
The conclusion there was that TensorRT does not support the Round operation yet. Is that the same conclusion here, i.e. the grid_sampler operation is not supported in TensorRT yet? I also found an issue for this (#2612) that was marked 'Closed', but it looks like my issue is exactly the same.