Open d5423197 opened 1 month ago
Btw, I have confirmed this issue is related to the ConvLSTM2D layer. In my tests, a model built only up to the point before the ConvLSTM2D layer initializes successfully, but once the ConvLSTM2D layer is added, initialization fails.
import tensorflow as tf
import tf2onnx
import models as M
model = M.BCDU_net_D3(input_size=input_shape, traning=False)
spec = (tf.TensorSpec((1, 256, 256, 3), tf.float32, name="input"),)
model_proto, _ = tf2onnx.convert.from_keras(model, input_signature=spec, opset=13, output_path=out_path)
This is the code used to get the ONNX model. onnxruntime version: onnxruntime-gpu==1.12.0
Thanks for the updated ticket info. Could you mention your OS version just for reference? Also, have you tried running the ONNX model with https://github.com/NVIDIA/TensorRT/tree/main/samples/trtexec rather than ORT?
Hi @moraxu ,
No, I have not tried trtexec. I am a Python user.
OS version: Ubuntu 20.04
Oh, trtexec is just the name of the executable; it can be run on Linux. As was mentioned in https://github.com/NVIDIA/TensorRT/issues/4109#issuecomment-2335112830, we'd like to be sure the issue can be isolated to TRT itself, rather than ORT. Do you have access to the instructions here: https://github.com/NVIDIA/TensorRT/tree/main/samples/trtexec ?
Can you run it like this:
./trtexec --onnx=model.onnx
on your model to confirm the issue persists? I'll file an internal bug then.
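For anyone who prefers to stay in Python, here is a minimal sketch of driving the same trtexec command through `subprocess`. The helper names `trtexec_command` and `run_trtexec` are made up for illustration, and the extra `--verbose` flag is an optional addition that was not part of the request above. The run is skipped when no `trtexec` binary is found on PATH.

```python
import shutil
import subprocess


def trtexec_command(onnx_path, trtexec="trtexec", verbose=True):
    """Build the argument list for a trtexec run on an ONNX file."""
    cmd = [trtexec, f"--onnx={onnx_path}"]
    if verbose:
        # Optional: more detailed build logs, useful when filing a bug.
        cmd.append("--verbose")
    return cmd


def run_trtexec(onnx_path, trtexec="trtexec"):
    """Run trtexec if it is available.

    Returns (returncode, combined stdout/stderr), or None when the
    binary cannot be found on PATH.
    """
    if shutil.which(trtexec) is None:
        return None
    result = subprocess.run(
        trtexec_command(onnx_path, trtexec),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr
```

A non-zero return code together with the same concatenation error in the output would suggest the failure is in TRT rather than in ORT.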
@moraxu I installed tensorrt using pip (following the instructions in the official README). I tried to build the engine using only tensorrt and got the same error. Please check it.
Do you mean the pip version of tensorrt is different from the executable trtexec?
Thanks, to clarify, trtexec is a standalone binary tool included with the TRT SDK (typically available when you install TRT using the tar or deb packages from NVIDIA). It helps with quick model conversion and testing, but it's separate from the pip version.
The version of TRT installed via pip should be the same as the version of trtexec, assuming they're from the same release, so the issue might be with TRT itself.
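One way to see which TRT release the pip install corresponds to is to read the installed package metadata. This is a small sketch using only the standard library; the helper name `installed_version` is hypothetical, and it assumes the package was installed under the distribution name `tensorrt`.

```python
from importlib import metadata


def installed_version(package="tensorrt"):
    """Return the installed version string of a pip package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("tensorrt (pip):", installed_version())
```

The printed version can then be compared against the release of the TRT SDK that a given trtexec binary came from.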
I tried to build it using only tensorrt.
Could you paste the full Python snippet here, on how you invoke the builder etc.? Apologies for the questions, I'd need that to file the bug.
import engine as eng
import argparse
from onnx import ModelProto
import tensorrt as trt
engine_name = "test_cseg"
onnx_path = "weights.120-0.12_fix_sim.onnx"
batch_size = 1
model = ModelProto()
with open(onnx_path, "rb") as f:
    model.ParseFromString(f.read())
d0 = model.graph.input[0].type.tensor_type.shape.dim[1].dim_value
d1 = model.graph.input[0].type.tensor_type.shape.dim[2].dim_value
d2 = model.graph.input[0].type.tensor_type.shape.dim[3].dim_value
shape = [batch_size, d0, d1, d2]
engine = eng.build_engine(onnx_path, shape=shape)
eng.save_engine(engine, engine_name)
@moraxu
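The `engine` module imported in the snippet above is the poster's own helper and is not shown in this thread. As a rough sketch only, a builder-only reproduction with the TensorRT Python API (no ORT involved) might look like the following. The function name `build_serialized_engine` and the 1 GiB workspace limit are assumptions, the `shape` argument from the original call is not handled, and the calls used are those available in the TRT 8.5 Python API.

```python
def build_serialized_engine(onnx_path, workspace_bytes=1 << 30):
    """Parse an ONNX file and build a serialized TensorRT engine.

    Hypothetical helper; not the poster's `eng.build_engine`.
    """
    # Imported lazily so this module can be loaded without TensorRT installed.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    # ONNX models require an explicit-batch network in TRT 8.x.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [parser.get_error(i).desc() for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed:\n" + "\n".join(errors))

    config = builder.create_builder_config()
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, workspace_bytes)

    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("TensorRT failed to build the engine")
    return serialized
```

If the concatenation error is raised while parsing or building here, ORT can be ruled out as the source.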
Thank you, I've filed an internal bug and will let you know if more info is needed.
@d5423197 I was asked if you can try to run the model with a newer 10.x TRT version?
This is a very obvious problem. This bug is related to the TensorFlow ConvLSTM2D layer. Don't they know whether they have made this layer compatible? @moraxu
@d5423197 but are you able to run this with a newer 10.x TRT version, or are you strictly limited to 8.5.1.7?
@moraxu For now, I am strictly limited to 8.5.1.7.
I see. The issue has been fixed in the upcoming 10.6 release, though.
@moraxu Thanks, may I ask about the specific cause of this problem?
A small issue in our vectorizer within our backend graph compiler.
Description
I tried to run an ONNX model through onnxruntime with TensorrtExecutionProvider, but initialization fails.
Error msg:
2024-09-09 10:58:29.082851313 [E:onnxruntime:Default, tensorrt_execution_provider.h:58 log] [2024-09-09 02:58:29 ERROR] [concatenationLayer.cpp::estimateOutputDims::110] Error Code 4: Internal Error ((Unnamed Layer* 73) [Concatenation]: all concat input tensors must have the same dimensions except on the concatenation axis (1), but dimensions mismatched at index 0. Input 0 shape: [2,64,64,256], Input 1 shape: [0,64,64,256])
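To make the message easier to read: the rule it states is that all inputs to a concatenation must agree on every dimension except the concatenation axis (axis 1 here), and the two reported shapes differ at index 0 (2 vs. 0). The following is a plain-Python illustration of that rule applied to the shapes from the log. It is not TensorRT's implementation, and `check_concat_shapes` is a hypothetical helper name.

```python
def check_concat_shapes(shapes, axis):
    """Return (input_index, dim_index) pairs that violate the concat rule.

    Every input must match input 0 on all dimensions except `axis`.
    A rank mismatch is reported as (input_index, None).
    """
    reference = shapes[0]
    mismatches = []
    for input_index, shape in enumerate(shapes[1:], start=1):
        if len(shape) != len(reference):
            mismatches.append((input_index, None))
            continue
        for dim_index, (ref_dim, dim) in enumerate(zip(reference, shape)):
            if dim_index != axis and ref_dim != dim:
                mismatches.append((input_index, dim_index))
    return mismatches


# Shapes copied from the error message; the concatenation axis is 1.
shapes = [[2, 64, 64, 256], [0, 64, 64, 256]]
print(check_concat_shapes(shapes, axis=1))  # [(1, 0)]
```

The single mismatch is at dimension 0 of input 1, where one input has size 2 and the other has size 0.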
Environment
TensorRT Version: TensorRT 8.5.1.7
NVIDIA GPU: A5000
NVIDIA Driver Version: 11.4
CUDA Version: 11.4
CUDNN Version:
Operating System:
Python Version (if applicable): 3.8.0
Tensorflow Version (if applicable): 2.8.0
PyTorch Version (if applicable): N/A
Baremetal or Container (if so, version): N/A
Relevant Files
Model link: https://github.com/rezazad68/BCDU-Net/blob/master/Lung%20Segmentation/models.py
Steps To Reproduce