NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

FillVectorsFromInput Starts must be a 1-D array #1764

Open aleksandarilic95 opened 2 years ago

aleksandarilic95 commented 2 years ago

Description

Let me preface this by saying that this is probably not an ONNX GraphSurgeon bug, but rather something on my part that I don't know how to solve. Since I've been searching the internet for the past few days, this seemed like the best place to ask. I'm trying to remove the ArrayFeatureExtractor layer from my network and replace it with a combination of Split and Slice layers. The ONNX GraphSurgeon part runs without errors, and I get my new network with the "starts" and "ends" inputs to the Slice layer being 1-D arrays. But when I later run it through the ONNX Runtime API, I get the following output:

2022-01-27 13:44:57.527266048 [W:onnxruntime:, graph.cc:122 MergeShapeInfo] Error merging shape info for output. 'split_out_1' source:{,1} target:{1}. Falling back to lenient merge.
2022-01-27 13:44:57.527293714 [W:onnxruntime:, graph.cc:122 MergeShapeInfo] Error merging shape info for output. 'split_out_2' source:{,1} target:{1}. Falling back to lenient merge.
2022-01-27 13:44:57.527305561 [W:onnxruntime:, graph.cc:122 MergeShapeInfo] Error merging shape info for output. 'split_out_3' source:{,1} target:{1}. Falling back to lenient merge.
2022-01-27 13:44:57.536553595 [E:onnxruntime:, sequential_executor.cc:346 Execute] Non-zero status code returned while running Slice node. Name:'' Status Message: slice.cc:153 FillVectorsFromInput Starts must be a 1-D array
Traceback (most recent call last):
  File "inference.py", line 13, in <module>
    outputs = ort_sess.run(None, {'float_input': x})
  File "/home/alex/anaconda3/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 192, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Slice node. Name:'' Status Message: slice.cc:153 FillVectorsFromInput Starts must be a 1-D array

I'm posting both of my networks, my Python code for the GraphSurgeon pass, and my Python code for simple inference.

[Screenshot from 2022-01-27 14-00-17]

[Screenshot from 2022-01-27 14-00-23]

import onnx_graphsurgeon as gs
import onnx
import numpy as np

graph = gs.import_onnx(onnx.load("knn-fixed.onnx"))

# Find the ArrayFeatureExtractor node; its first input is the class tensor
# we want to slice from.
inputs_array = []
for node in graph.nodes:
    if node.op == "ArrayFeatureExtractor":
        inputs_array.append(node.inputs)

class_tensor = inputs_array[0][0]

flatten_node = [node for node in graph.nodes if node.op == "Flatten"][0]

# Split the Flatten output into three single-element tensors along axis 1.
split_shape = np.array([0]).shape  # == (1,)

split_out_1 = gs.Variable("split_out_1", dtype=np.int64, shape=split_shape)
split_out_2 = gs.Variable("split_out_2", dtype=np.int64, shape=split_shape)
split_out_3 = gs.Variable("split_out_3", dtype=np.int64, shape=split_shape)
split_node = gs.Node(op="Split", inputs=flatten_node.outputs,
                     outputs=[split_out_1, split_out_2, split_out_3])
split_node.attrs["axis"] = 1
split_node.attrs["split"] = [1, 1, 1]

# Slice the class tensor at each split index, then concatenate the results.
slice_out_1 = gs.Variable("slice_out_1", dtype=np.int64)
slice_out_2 = gs.Variable("slice_out_2", dtype=np.int64)
slice_out_3 = gs.Variable("slice_out_3", dtype=np.int64)

slice_node_1 = gs.Node(op="Slice", inputs=[class_tensor, split_out_1, split_out_1], outputs=[slice_out_1])
slice_node_2 = gs.Node(op="Slice", inputs=[class_tensor, split_out_2, split_out_2], outputs=[slice_out_2])
slice_node_3 = gs.Node(op="Slice", inputs=[class_tensor, split_out_3, split_out_3], outputs=[slice_out_3])

concat_out = gs.Variable("concat_out", dtype=np.int64)
concat_node = gs.Node(op="Concat", inputs=[slice_out_1, slice_out_2, slice_out_3], outputs=[concat_out])
concat_node.attrs["axis"] = 0

# The Concat output becomes the new graph output.
graph.outputs = concat_node.outputs

graph.nodes.extend([split_node, slice_node_1, slice_node_2, slice_node_3, concat_node])

graph.cleanup().toposort()

onnx.save(gs.export_onnx(graph), "knn-fixed-2.onnx")

Code for simple inference:

import numpy as np
import onnxruntime as ort
import pandas as pd

# Load one sample row (dropping the label columns) and shape it to (1, 266).
df = pd.read_csv("../some_path/keypoints.csv")
test = df.iloc[0].drop(['item', 'class'])
x = np.array(test).astype(np.float32).reshape(-1, 266)
y = df.iloc[0]['class']  # ground-truth label, for reference

ort_sess = ort.InferenceSession('knn-fixed-2.onnx')
outputs = ort_sess.run(None, {'float_input': x})

print(f"Predicted: {outputs}")

Again, I understand this may not be the appropriate place to ask, but I couldn't find a forum dedicated to this, and Stack Overflow doesn't seem to be the best place for ONNX GraphSurgeon since it's kind of a niche question. If this goes against the rules, I understand if you delete it, but any help is more than welcome!

pranavm-nvidia commented 2 years ago

Could you share your model as well?

Looking at just the script and screenshots, my best guess is that you need a Squeeze right after the Split node.
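If it helps, here's a rough, untested sketch of what that could look like, reusing the graph and split_out_* variables from your script (the squeeze_out_* names are just made up, and axes is written as an attribute, which assumes your model's opset is below 13):

import numpy as np
import onnx_graphsurgeon as gs

# Untested sketch: insert a Squeeze after each Split output so that the
# tensors fed to Slice as starts/ends become 1-D.
squeeze_outs = []
for i, split_out in enumerate([split_out_1, split_out_2, split_out_3], start=1):
    squeeze_out = gs.Variable(f"squeeze_out_{i}", dtype=np.int64)  # made-up name
    squeeze = gs.Node(op="Squeeze", inputs=[split_out], outputs=[squeeze_out],
                      attrs={"axes": [1]})  # attribute form; opset >= 13 takes axes as an input instead
    graph.nodes.append(squeeze)
    squeeze_outs.append(squeeze_out)

# Then feed each squeeze_outs[i] into the corresponding Slice in place of split_out_i.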

aleksandarilic95 commented 2 years ago

I'll try a Squeeze after the Split. I'm not sure if links are allowed here, and GitHub doesn't allow .onnx files to be uploaded, so I'll report back with my findings.

pranavm-nvidia commented 2 years ago

I think most people post their models via Google Drive/Dropbox.

aleksandarilic95 commented 2 years ago

https://drive.google.com/file/d/1qQSQ9A5uyMPKm3xnG7s85uYinXaCXx-2/view?usp=sharing

Here's the link to the model before adding the Squeeze. Adding the Squeeze doesn't change anything; I still get the same error.

Here's the image of the network after adding Squeeze:

[Screenshot from 2022-01-27 16-53-44: network after adding Squeeze]

Here's the model after adding Squeeze:

https://drive.google.com/file/d/1QOQD7kcV1iMkylqOr0YVrgrBKbm2emKq/view?usp=sharing

pranavm-nvidia commented 2 years ago

I think you were just missing the axes attribute. Here's a script to fix that:

import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("knn-fixed-2.onnx"))

# The Squeeze nodes had no axes specified; set it explicitly.
for node in graph.nodes:
    if node.op == "Squeeze":
        node.attrs["axes"] = [1]

onnx.save(gs.export_onnx(graph), "knn-fixed-3.onnx")

After that, I'm able to run the model with ONNX-RT:

[I] onnxrt-runner-N0-01/27/22-08:08:29  | Activating and starting inference
[I] onnxrt-runner-N0-01/27/22-08:08:29 
    ---- Inference Input(s) ----
    {float_input [dtype=float32, shape=(1, 266)]}
[I] onnxrt-runner-N0-01/27/22-08:08:29 
    ---- Inference Output(s) ----
    {concat_out [dtype=int64, shape=(0,)]}
[I] onnxrt-runner-N0-01/27/22-08:08:29  | Completed 1 iteration(s) in 44.71 ms | Average inference time: 44.71 ms.
[I] PASSED | Command: polygraphy run knn-fixed-3.onnx --onnxrt

As an aside, you may also want to look into using ONNX-GS's register/layer APIs for this replacement, as that might simplify your code quite a bit. There are a couple of examples here and here if you want to take a look.
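As a rough illustration of the layer API (untested; the slice1d and concat helper names here are made up, not part of ONNX-GS):

import onnx_graphsurgeon as gs

# Register helpers on gs.Graph so graph surgery reads like model code.
@gs.Graph.register()
def slice1d(self, data, starts, ends):
    # self.layer() creates the node and returns its output tensors.
    return self.layer(op="Slice", inputs=[data, starts, ends],
                      outputs=["slice_out"])[0]

@gs.Graph.register()
def concat(self, inputs, axis=0):
    return self.layer(op="Concat", inputs=inputs, outputs=["concat_out"],
                      attrs={"axis": axis})[0]

# With these registered, the replacement collapses to something like:
#   graph.outputs = [graph.concat([graph.slice1d(class_tensor, s, e)
#                                  for s, e in start_end_pairs])]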

aleksandarilic95 commented 2 years ago

You were right about the missing "axes" attribute; after adding it, ONNX Runtime inference runs correctly. However, the same model fails during TensorRT parsing with:

ERROR: builtin_op_importers.cpp:3013 In function importSlice:
[8] Assertion failed: starts.size() == axes.size()

Also, sorry for commenting on a closed issue; I closed it by accident and don't know how to reopen it.

pranavm-nvidia commented 2 years ago

Looks like it's failing because of the input shape, which is (0, 266). If I fix it to a positive value, I can get a little further:

polygraphy surgeon sanitize knn-fixed-3.onnx -o folded.onnx \
    --override-input-shapes float_input:[1,266]

But TRT still can't handle the model because the Top-K output is being used as a shape tensor and TRT doesn't currently support data-dependent shapes:

[E] In node 9 (parseGraph): INVALID_NODE: Invalid Node - node_of_slice_out_1
    [optimizer/common/graph/graph.cpp::computeInputExecutionUses::556] Error Code 9: Internal Error (To_TopK: ITopKLayer cannot be used to compute a shape tensor)
[!] Could not parse ONNX correctly

Looking at your original model, I wonder if you could implement the ArrayFeatureExtractor as a Gather? Based on the (very sparse) description here, it seems to do the same thing, and Gather would avoid the need for this Split/Slice pattern.
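A rough, untested sketch of that replacement (assuming ArrayFeatureExtractor(X, Y) selects entries of X along the last axis by the indices Y, per the ONNX-ML spec; the knn-gather.onnx output name is made up):

import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("knn-fixed.onnx"))

# Untested sketch: replace each ArrayFeatureExtractor with a Gather along
# the last axis, reusing the original node's inputs and outputs.
new_nodes = []
for node in graph.nodes:
    if node.op == "ArrayFeatureExtractor":
        data, indices = node.inputs          # X, Y per the ONNX-ML spec
        outputs = list(node.outputs)
        node.outputs.clear()                 # disconnect so cleanup() drops the old node
        new_nodes.append(gs.Node(op="Gather", inputs=[data, indices],
                                 outputs=outputs, attrs={"axis": -1}))

graph.nodes.extend(new_nodes)
graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "knn-gather.onnx")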

aleksandarilic95 commented 2 years ago

Trying with Gather and a fixed input size, I do get a bit further, but now I get Myelin errors while building/parsing:

 myelin::ir::operation_t::replace_def(myelin::ir::tensor_t*, size_t): Assertion `idx < out_tensors().size()' failed.
Aborted (core dumped)

[Screenshot from 2022-02-04 10-39-07]

Here's the model: https://drive.google.com/file/d/1rLGdvbyGClhUrwLJXGGbWD-80cR1Dpds/view?usp=sharing

EDIT: Again, ONNX Runtime inference works just fine.

pranavm-nvidia commented 2 years ago

I'm seeing the same thing. I've filed a bug internally and will let you know if we can find a workaround in the meantime.