Question about how to use other models

Moskito89 commented 2 months ago

Dear @ankane,

thanks for this library! I was following your documentation, and it worked well, but when I tried using another model from https://github.com/onnx/models I ran into trouble.

The problem is that when I try to use yolov4, for example, this error appears: PHP Fatal error: Uncaught OnnxRuntime\Exception: Unknown input: inputs in /vendor/ankane/onnxruntime/src/InferenceSession.php:350

I understand that the model expects a different input, but I don’t understand where I can get information about it. Calling the inputs() method returns

array:1 [
  0 => array:3 [
    "name" => "input_1:0"
    "type" => "tensor(float)"
    "shape" => array:4 [
      0 => "unk__2104"
      1 => 416
      2 => 416
      3 => 3
    ]
  ]
]

but I don’t understand that information. Can you give me a hint? Thank you very much in advance!

ankane commented 2 months ago

Hi @Moskito89, it looks like you're passing an input named inputs when the model expects input_1:0.

$model->predict(['input_1:0' => ...]);
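One way to avoid typos in the input name is to read it from the array that inputs() returns instead of typing it by hand. This is only a minimal sketch, assuming the array has the structure quoted earlier in this thread; firstInputName is a hypothetical helper, not part of the library.

```php
<?php

// Hypothetical helper (not part of onnxruntime-php): returns the name of
// the first input described by the array that inputs() returns.
function firstInputName(array $inputs): string
{
    if (!isset($inputs[0]['name'])) {
        throw new InvalidArgumentException('No input description found');
    }
    return $inputs[0]['name'];
}

// Same structure as the inputs() output quoted in this thread.
$inputs = [
    [
        'name' => 'input_1:0',
        'type' => 'tensor(float)',
        'shape' => ['unk__2104', 416, 416, 3],
    ],
];

$name = firstInputName($inputs);
echo $name, PHP_EOL;
```

With a loaded model, the call would then look roughly like $model->predict([firstInputName($model->inputs()) => $data]), where $data is the prepared input.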

Moskito89 commented 2 months ago

Thanks, @ankane, that's what I thought as well! The problem is that changing the input's name leads to another error: PHP Fatal error: Uncaught OnnxRuntime\Exception: Unknown input: inputs_1:0 in /vendor/ankane/onnxruntime/src/InferenceSession.php:350

ankane commented 2 months ago

It looks like you're passing inputs_1:0 instead of input_1:0.

Moskito89 commented 2 months ago

You're right, @ankane, that was a mistake! 👍 Now the next error appears:

PHP Fatal error: Uncaught OnnxRuntime\Exception: Got invalid dimensions for input: input_1:0 for the following indices
 index: 1 Got: 427 Expected: 416
 index: 2 Got: 640 Expected: 416
 Please fix either the inputs/outputs or the model. in /vendor/ankane/onnxruntime/src/InferenceSession.php:574

Does this mean the pixels should be extracted differently?

ankane commented 2 months ago

You'll need to make sure it matches the shape returned from the inputs() method.
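To see where the data and the model disagree before calling predict, the shape of the nested array can be compared with the shape from inputs(). The following is a rough sketch in plain PHP; nestedShape and shapeMismatches are hypothetical helpers, and it assumes the data is a rectangular, list-indexed nested array and that string entries such as "unk__2104" accept any size.

```php
<?php

// Hypothetical helper: follows the first element at each nesting level to
// find the shape of a rectangular, list-indexed nested array.
function nestedShape(array $data): array
{
    $shape = [];
    $level = $data;
    while (is_array($level)) {
        $shape[] = count($level);
        if (count($level) === 0) {
            break;
        }
        $level = $level[0];
    }
    return $shape;
}

// Hypothetical helper: lists the indices where the data shape differs from
// the expected shape. String entries (assumed to be dynamic dimensions)
// are skipped because they accept any size.
function shapeMismatches(array $actual, array $expected): array
{
    if (count($actual) !== count($expected)) {
        throw new InvalidArgumentException('Rank mismatch');
    }
    $mismatches = [];
    foreach ($expected as $i => $dim) {
        if (is_int($dim) && $actual[$i] !== $dim) {
            $mismatches[$i] = ['got' => $actual[$i], 'expected' => $dim];
        }
    }
    return $mismatches;
}

// A batch of one 427 x 640 RGB image, the size reported in the error above.
$pixel = [0.0, 0.0, 0.0];
$image = array_fill(0, 427, array_fill(0, 640, $pixel));
$batch = [$image];

$expected = ['unk__2104', 416, 416, 3];
$mismatches = shapeMismatches(nestedShape($batch), $expected);
print_r($mismatches);
```

For this example the mismatches are at indices 1 and 2, the same indices the exception reports.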

Moskito89 commented 2 months ago

Absolutely! But how do I read the information about the shape? What does it tell us?

array:1 [
  0 => array:3 [
    "name" => "input_1:0"
    "type" => "tensor(float)"
    "shape" => array:4 [
      0 => "unk__2104"
      1 => 416
      2 => 416
      3 => 3
    ]
  ]
]

Maybe this is the heart of my question. :-)

CodeWithKyrian commented 2 months ago

Hello @Moskito89, I get your question.

The input shape ["unk__2104", 416, 416, 3] indicates the model expects square RGB images. YOLO models are vision models, so the input shape is fairly easy to interpret. The dimensions are specified in BHWC format (also written NHWC), which stands for Batch size, Height, Width, and Channels.

Indices 1, 2, and 3 represent the image's height (416), width (416), and number of channels (3), respectively. The batch size (index 0) is the number of images processed at once; for example, 1 signifies a single image. It's a dynamic dimension, which is why inputs() shows the placeholder name "unk__2104" instead of a number, so it can change from call to call.

To feed the image into the model, you'll need to extract its pixel information and reshape it into a multidimensional array of shape [1, 416, 416, 3], assuming you're processing a single image.
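As a rough illustration of that reshaping step, here is a sketch in plain PHP that takes an H x W x 3 array of 0-255 pixel values, stretches it to 416 x 416 with nearest-neighbor sampling, and wraps it in a batch of one. resizeAndScale is a hypothetical helper. Stretching the image and scaling values to the 0.0-1.0 range are assumptions; the model's own documentation decides the actual preprocessing, which may pad the image to keep its aspect ratio instead.

```php
<?php

// Hypothetical helper: resizes an H x W x 3 nested array to the target size
// with nearest-neighbor sampling and scales 0-255 values to 0.0-1.0 floats.
// Assumption: the model wants values in that range; check its documentation.
function resizeAndScale(array $pixels, int $targetHeight, int $targetWidth): array
{
    $height = count($pixels);
    $width = count($pixels[0]);
    $out = [];
    for ($y = 0; $y < $targetHeight; $y++) {
        // Map each target row back to the nearest source row.
        $srcY = intdiv($y * $height, $targetHeight);
        $row = [];
        for ($x = 0; $x < $targetWidth; $x++) {
            $srcX = intdiv($x * $width, $targetWidth);
            $row[] = array_map(fn($v) => $v / 255.0, $pixels[$srcY][$srcX]);
        }
        $out[] = $row;
    }
    return $out;
}

// Stand-in for real pixel data: a 427 x 640 image filled with one color.
$image = array_fill(0, 427, array_fill(0, 640, [255, 128, 0]));

// Shape [1, 416, 416, 3]: a batch containing one resized image.
$input = [resizeAndScale($image, 416, 416)];

echo count($input), ' x ', count($input[0]), ' x ',
    count($input[0][0]), ' x ', count($input[0][0][0]), PHP_EOL;
```

The resulting $input would then be passed under the model's input name, for example $model->predict(['input_1:0' => $input]).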

I hope this helps.

CodeWithKyrian commented 2 months ago

Here, check this documentation for yolov4, because you'll also need to be able to interpret the model outputs to use them for anything creative.

Moskito89 commented 2 months ago

Thank you, @CodeWithKyrian, I'm beginning to understand this! So the name of the input has to be configured individually for each model. In addition, each model has its own expectations of the input and each model returns its own information. I didn't know that!

ankane / onnxruntime-php

Question about how to use other models #6