microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

index: 1 Got: 3 Expected: 416 index: 3 Got: 416 Expected: 3 Please fix either the inputs or the model. #5819

Closed MuhammadAsadJaved closed 3 years ago

MuhammadAsadJaved commented 3 years ago

Describe the bug I used the original Yolov3 example and it runs successfully. Then I used my own Yolov3 model (it takes two inputs, a visible and an infrared image) and I got this error.

Urgency As early as possible

System information

To Reproduce Steps and code: I converted the .pb weights to .onnx using https://github.com/onnx/tensorflow-onnx

with the command python -m tf2onnx.convert --input modelInPb/Pedestrian_yolov3_520.pb --inputs input/input_data:0[1,416,416,3],input/lwir_input_data:0[1,416,416,3] --outputs pred_sbbox/concat_2:0,pred_mbbox/concat_2:0,pred_lbbox/concat_2:0 --output modelOut/Pedestrian_yolov3_520.onnx --opset 11
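
To confirm what the exported graph actually expects, the input and output names and shapes can be printed with a short check like this (model path taken from the command above):

import onnxruntime

# print the exported model's inputs and outputs to verify names and shapes
sess = onnxruntime.InferenceSession('modelOut/Pedestrian_yolov3_520.onnx')
for inp in sess.get_inputs():
    print('input :', inp.name, inp.shape, inp.type)
for out in sess.get_outputs():
    print('output:', out.name, out.shape, out.type)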

Then I use this .onnx model with the following code:

import numpy as np
from PIL import Image
import onnxruntime

# this function is from yolo3.utils.letterbox_image
def letterbox_image(image, size):
    '''resize image with unchanged aspect ratio using padding'''
    iw, ih = image.size
    w, h = size
    scale = min(w/iw, h/ih)
    nw = int(iw*scale)
    nh = int(ih*scale)

    image = image.resize((nw,nh), Image.BICUBIC)
    new_image = Image.new('RGB', size, (128,128,128))
    new_image.paste(image, ((w-nw)//2, (h-nh)//2))
    return new_image

def preprocess(img):
    model_image_size = (416, 416)
    boxed_image = letterbox_image(img, tuple(reversed(model_image_size)))
    image_data = np.array(boxed_image, dtype='float32')
    image_data /= 255.
    image_data = np.transpose(image_data, [2, 0, 1])
    image_data = np.expand_dims(image_data, 0)
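    # note: after the transpose and expand_dims above the array is [1, 3, 416, 416]
    # (channels first); the error in the title suggests the model expects [1, 416, 416, 3]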
    return image_data

session = onnxruntime.InferenceSession('./Pedestrian_yolov3_520.onnx')
image = Image.open('./dog416.jpg')

image_data = preprocess(image)
image_size = np.array([image.size[1], image.size[0]], dtype=np.float32).reshape(1, 2)

print(image_size)

# collect input/output names (avoids shadowing the built-in `input`)
input_names = [inp.name for inp in session.get_inputs()]
inname = input_names[1]        # visible image input
lwir_inname = input_names[0]   # infrared (lwir) image input

outname = [out.name for out in session.get_outputs()]

print(inname)
print(lwir_inname)
print(outname)

boxes, scores, indices = session.run(outname, {inname: image_data , lwir_inname: image_data,  "image_shape":image_size})

#boxes, scores, indices = session.run(outname, {inname: image_data ,  "image_shape":image_size , lwir_inname: image_data , "image_shape":image_size})

#boxes, scores, indices = session.run(None, {"input_data:0": image_data, "image_shape":image_size , "lwir_input_data:0": image_data, "image_shape":image_size})

out_boxes, out_scores, out_classes = [], [], []
# each row of `indices` is (batch_index, class_index, box_index),
# as in the official Yolov3 example's postprocessing
for idx_ in indices:
    out_classes.append(idx_[1])
    out_scores.append(scores[tuple(idx_)])
    idx_1 = (idx_[0], idx_[2])
    out_boxes.append(boxes[idx_1])

https://drive.google.com/file/d/1vT5ZPH-LuW5cGrdENjWb2uOhvJBygSSk/view?usp=sharing

Expected behavior Importing the .onnx model should produce the same output as the official Yolov3 example.

Screenshots Screenshots of the error are attached.

Additional context I am also not sure if I am calling it the right way: boxes, scores, indices = session.run(outname, {inname: image_data , lwir_inname: image_data, "image_shape":image_size})

This model takes two inputs (a visible image and an infrared image), so how can I pass both? Can I use one image_size for both, or do I need to pass it separately for each? The original Yolov3 example is boxes, scores, indices = session.run(outname, {inname: image_data , "image_shape":image_size}). How can I change this example for two inputs?

zhanghuanrong commented 3 years ago

As far as I can see from the onnx model, there are only two inputs: [input/lwir_input_data:0, input/input_data:0], both of type float32[1,416,416,3]. So:

  • do not send "image_shape":image_size when calling session.run()
  • reshape to the correct shape during preprocessing:
  • your image seems to be grayscale, so it is of shape [416, 416]? Try an RGB image to get [416, 416, 3]
  • expand its dims to [1, 416, 416, 3]. Thanks, Lei
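
Putting those points together, a rough sketch of the call (dummy tensors for illustration; real code would feed the letterboxed visible and infrared images, without the channel transpose):

import numpy as np
import onnxruntime

sess = onnxruntime.InferenceSession('./Pedestrian_yolov3_520.onnx')

# dummy NHWC tensors just to show the feed layout; replace with real
# preprocessed images of shape [1, 416, 416, 3]
visible = np.zeros((1, 416, 416, 3), dtype=np.float32)
infrared = np.zeros((1, 416, 416, 3), dtype=np.float32)

outputs = sess.run(None, {
    'input/input_data:0': visible,
    'input/lwir_input_data:0': infrared,
})  # note: no "image_shape" entry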

MuhammadAsadJaved commented 3 years ago

I am using RGB images. It shows 416 x 416 only because I printed just w and h. I will try to adjust the pre-processing step. Thank you

khadija23 commented 1 year ago

I'm facing the same issue with my onnx model. Can anyone help me?

khadija23 commented 1 year ago

I get:

onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: input for the following indices index: 1 Got: 3 Expected: 1 Please fix either the inputs or the model.

I trained a u2net model on the midv500 dataset to build a semantic segmentation model, then used the exported model with the rembg library to remove image backgrounds.

khadija23 commented 1 year ago

This is an example: my image shape is (1440, 2560, 3) and the onnx model's input shape is [1, 1, 320, 320].
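
For what it's worth, that mismatch suggests the model expects a single-channel 320x320 batch (NCHW) while a 3-channel image is being fed. A minimal preprocessing sketch (file name illustrative; the /255 normalization is an assumption and must match how the model was trained):

import numpy as np
from PIL import Image

# map an (H, W, 3) RGB image to the model's [1, 1, 320, 320] input
img = Image.open('input.jpg').convert('L')      # RGB -> grayscale
img = img.resize((320, 320), Image.BICUBIC)
x = np.asarray(img, dtype=np.float32) / 255.0   # assumed normalization
x = x[np.newaxis, np.newaxis, :, :]             # shape (1, 1, 320, 320)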