microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Symbolic Shape infer fails on onnx file without much logs #21120

Open aaditya-srivathsan opened 2 months ago

aaditya-srivathsan commented 2 months ago

Describe the issue

I have a DINO model with a Swin backbone, and I want to apply certain TensorRT optimizations to the ONNX model itself. I am trying symbolic shape inference because my first attempt gave me this error:

UNAVAILABLE: Internal: onnx runtime error 6: Exception during initialization: /workspace/onnxruntime/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc:1387 SubGraphCollection_t onnxruntime::TensorrtExecutionProvider::GetSupportedList(SubGraphCollection_t, int, int, const onnxruntime::GraphViewer&, bool*) const [ONNXRuntimeError] : 1 : FAIL : TensorRT input: value has no shape specified. Please run shape inference on the onnx model first. Details can be found in [NVIDIA - TensorRT](https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html#shape-inference-for-tensorrt-subgraphs)

After running the shape inference command, the conversion fails without much indication as to why:

Potential unsafe merge between symbolic expressions: (ceiling(Pad_1747_o0__d2/2 - 1/2)*ceiling(Pad_1747_o0__d3/2 - 1/2),floor(Pad_1747_o0__d2/2)*floor(Pad_1747_o0__d3/2))
Potential unsafe merge between symbolic expressions: (floor(Pad_1747_o0__d2/2)*floor(Pad_1747_o0__d3/2))
Potential unsafe merge between symbolic expressions: (ceiling(ConstantOfShape_2097_o0__d2/2 - 1/2)*ceiling(floor(Pad_1747_o0__d3/2)/2 - 1/2),floor(ConstantOfShape_2097_o0__d2/2)*floor(floor(Pad_1747_o0__d3/2)/2))
Potential unsafe merge between symbolic expressions: (floor(ConstantOfShape_2097_o0__d2/2)*floor(floor(Pad_1747_o0__d3/2)/2))
Potential unsafe merge between symbolic expressions: (ceiling(Pad_6562_o0__d2/2 - 1/2)*ceiling(Pad_6562_o0__d3/2 - 1/2),floor(Pad_6562_o0__d2/2)*floor(Pad_6562_o0__d3/2))
Potential unsafe merge between symbolic expressions: (floor(Pad_6562_o0__d2/2)*floor(Pad_6562_o0__d3/2))
Potential unsafe merge between symbolic expressions: (Reshape_7877_o0__d2 + Reshape_7901_o0__d2 + Reshape_7925_o0__d2 + Reshape_7949_o0__d2 + Reshape_7973_o0__d2,Reshape_7882_o0__d2 + Reshape_7906_o0__d2 + Reshape_7930_o0__d2 + Reshape_7954_o0__d2 + Reshape_7978_o0__d2)
Potential unsafe merge between symbolic expressions: (Reshape_7877_o0__d2 + Reshape_7901_o0__d2 + Reshape_7925_o0__d2 + Reshape_7949_o0__d2 + Reshape_7973_o0__d2)
Traceback (most recent call last):
  File "/home/onnxruntime/onnxruntime/python/tools/symbolic_shape_infer.py", line 2757, in <module>
    out_mp = SymbolicShapeInference.infer_shapes(
  File "/home/onnxruntime/onnxruntime/python/tools/symbolic_shape_infer.py", line 2693, in infer_shapes
    raise Exception("Incomplete symbolic shape inference")
Exception: Incomplete symbolic shape inference
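
The warnings above already hint at where inference gets stuck, since each symbol carries an operator type and a number. As a minimal sketch for triaging a long log, the snippet below counts how often each such label appears in the unsafe-merge warnings. It assumes the symbols follow the pattern visible in this log, `<OpType>_<number>_o<output index>__d<dim index>`. The number appears to be the node's position in the graph rather than part of its name, but treat that as an assumption. The function name `labels_in_merge_warnings` is made up for illustration.

```python
import re
from collections import Counter

# Assumed symbol pattern, taken from the log above:
#   <OpType>_<number>_o<output index>__d<dim index>, e.g. Pad_1747_o0__d2
SYMBOL_RE = re.compile(r"([A-Za-z]+_\d+)_o\d+__d\d+")
WARNING_PREFIX = "Potential unsafe merge between symbolic expressions:"


def labels_in_merge_warnings(log_text):
    """Count occurrences of each <OpType>_<number> label in unsafe-merge warnings."""
    counts = Counter()
    for line in log_text.splitlines():
        if WARNING_PREFIX in line:
            counts.update(SYMBOL_RE.findall(line))
    return counts


# Two warning lines copied from the log above, plus the final exception line.
sample_log = """\
Potential unsafe merge between symbolic expressions: (floor(Pad_1747_o0__d2/2)*floor(Pad_1747_o0__d3/2))
Potential unsafe merge between symbolic expressions: (floor(ConstantOfShape_2097_o0__d2/2)*floor(floor(Pad_1747_o0__d3/2)/2))
Exception: Incomplete symbolic shape inference
"""

for label, n in labels_in_merge_warnings(sample_log).most_common():
    print(label, n)
```

The most frequent labels (here the `Pad` operator) are reasonable places to start looking in a graph viewer, and listing them may be enough for a maintainer to suggest a cause even when the model itself cannot be shared.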

To reproduce

I cannot share the ONNX model because it is confidential.

Urgency

Very urgent

Platform

Linux

OS Version

Ubuntu 20.07

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.15.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

TensorRT

Execution Provider Library Version

CUDA

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

yf711 commented 1 month ago

Have you tried the latest script from the main branch? (By the way, running the script with --verbose shows more log output.) If that doesn't help, could you share your model so we can reproduce the issue?
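
For reference, a rough sketch of what a verbose run might look like when driven from Python. It only builds the command line, so it can be inspected before running. The file names `model.onnx` and `model_with_shapes.onnx` are placeholders, and the helper `shape_infer_command` is hypothetical. The `--input`, `--output`, `--auto_merge` and `--verbose` flags are the ones the script's argument parser defines as far as I know. The verbosity level of 3 (detailed) is from memory of the script's help text, so please check `--help` on your version.

```python
import sys


def shape_infer_command(model_in, model_out, verbose=3, auto_merge=True):
    """Build the argv for running the symbolic shape inference tool as a module."""
    cmd = [
        sys.executable, "-m", "onnxruntime.tools.symbolic_shape_infer",
        "--input", model_in,
        "--output", model_out,
        "--verbose", str(verbose),  # assumed levels: 0 off, 1 warnings, 3 detailed
    ]
    if auto_merge:
        cmd.append("--auto_merge")
    return cmd


# Placeholder file names; substitute the real model paths.
cmd = shape_infer_command("model.onnx", "model_with_shapes.onnx")
print(" ".join(cmd))

# To actually run it (requires onnxruntime installed and the model on disk):
# import subprocess
# subprocess.run(cmd, check=True)
```

This runs the copy of the script bundled with the installed package. To try the main-branch version instead, one option is to download `symbolic_shape_infer.py` from the repository and pass its path to the interpreter in place of the `-m` module form.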