BobLd / YOLOv4MLNet

Use the YOLO v4 and v5 (ONNX) models for object detection in C# using ML.Net
MIT License
79 stars, 31 forks

How to use the repo to predict on images of a custom size? #11

Closed: zydjohnHotmail closed this issue 2 years ago

zydjohnHotmail commented 2 years ago

Hello: I am new to YOLO. I created my own dataset and trained a model with the ultralytics yolov5 Python repo, then exported it to a YOLOv5 .onnx file. My pictures are not square: they are 768px by 432px. I trained with the parameter --img 768, and training finished successfully. Now I want to use ML.NET to detect objects in some pictures. I created a WinForms app in Visual Studio 2019 (targeting .NET 5.0) and added the following NuGet packages:

PM> Install-Package Microsoft.ML
PM> Install-Package Microsoft.ML.ImageAnalytics
PM> Install-Package Microsoft.ML.OnnxRuntime
PM> Install-Package Microsoft.ML.OnnxTransformer

I also created the DataStructure class. The following is the definition of the “YoloV4BitmapData” class:

```csharp
public class YoloV4BitmapData
{
    [ColumnName("bitmap")]
    [ImageType(768, 432)]
    public Bitmap Image { get; set; }

    [ColumnName("width")]
    public float ImageWidth => Image.Width;

    [ColumnName("height")]
    public float ImageHeight => Image.Height;
}
```

The other classes are not changed. The following is part of the main program:

```csharp
using Microsoft.ML;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Drawing;
using System.IO;
using System.Windows.Forms;
using static Microsoft.ML.Transforms.Image.ImageResizingEstimator;
using static MLNetDetectObjectForm.DataStructure;

namespace MLNetDetectObjectForm
{
    public partial class Form1 : Form
    {
        public const string Image_to_detect = @"C:\Videos\input.PNG";
        public const string Image_output = @"C:\Videos\output.PNG";
        const string modelPath = @"C:\Videos\MLNetDetectObjectForm\MLNetDetectObjectForm\Models\best.onnx";
        const string imageFolder = @"Assets\Images";
        const string imageOutputFolder = @"Assets\Output";
        static readonly string[] classesNames = new string[] { "logo" };

        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            Image source_bmp = Image.FromFile(Image_to_detect);
            PictureBoxLogo.Image = source_bmp;
            MLContext mlContext = new();
            var pipeline = mlContext.Transforms.ResizeImages(inputColumnName: "bitmap",
                outputColumnName: "input_1:0", imageWidth: 768, imageHeight: 432, resizing: ResizingKind.Fill)
                .Append(mlContext.Transforms.ExtractPixels(outputColumnName: "input_1:0", scaleImage: 1f / 255f, interleavePixelColors: true))
                .Append(mlContext.Transforms.ApplyOnnxModel(
                    shapeDictionary: new Dictionary<string, int[]>()
                    {
                        { "input_1:0", new[] { 1, 768, 768, 3 } },
                        { "Identity:0", new[] { 1, 52, 52, 3, 85 } },
                        { "Identity_1:0", new[] { 1, 26, 26, 3, 85 } },
                        { "Identity_2:0", new[] { 1, 13, 13, 3, 85 } },
                    },
                    inputColumnNames: new[]
                    {
                        "input_1:0"
                    },
                    outputColumnNames: new[]
                    {
                        "Identity:0",
                        "Identity_1:0",
                        "Identity_2:0"
                    },
                    modelFile: modelPath, recursionLimit: 100));

……
```

The code compiles, but when I run it I get this error:

Message "name (Parameter 'Output tensor, Identity:0, does not exist in the ONNX model. Available output names are [output,402,471,540].')\r\nActual value was Identity:0."

I don’t quite understand what it means. My Python dataset has only one class/label, called ‘logo’; there are no other classes or labels. I don’t know how to change my code so that my trained model can detect objects in the image shown in a picture box.
I have only one image file to check whether my model works: C:\Videos\input.PNG. Do I have to change other classes, such as YoloV4Prediction or YoloV4Result? If so, how? Thanks,

AshwinRaikar88 commented 2 years ago

@zydjohnHotmail You could use a model inspection tool like Netron to check the names of the input and output layers. Judging from the error message, I guess the output layer you want is the one called "output" (the message also lists 402, 471 and 540 as available output names).
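If you would rather check the names from code than from Netron, here is a minimal sketch. It assumes the Microsoft.ML.OnnxRuntime package already installed in the question and reuses the model path from the question; the class name `InspectOnnxModel` is made up for illustration. It only prints what the model declares, so the names and shapes you see are whatever your own export contains.

```csharp
using System;
using Microsoft.ML.OnnxRuntime;

// Hypothetical helper class, not part of the YOLOv4MLNet repo.
public static class InspectOnnxModel
{
    public static void Main(string[] args)
    {
        // Default path copied from the question; pass another path as the first argument if needed.
        string modelPath = args.Length > 0
            ? args[0]
            : @"C:\Videos\MLNetDetectObjectForm\MLNetDetectObjectForm\Models\best.onnx";

        // Loading the session reads the graph metadata; no inference is run here.
        using var session = new InferenceSession(modelPath);

        Console.WriteLine("Inputs:");
        foreach (var input in session.InputMetadata)
        {
            // Dimensions is an int[]; dynamic axes are typically reported as -1.
            Console.WriteLine($"  {input.Key}: [{string.Join(", ", input.Value.Dimensions)}]");
        }

        Console.WriteLine("Outputs:");
        foreach (var output in session.OutputMetadata)
        {
            Console.WriteLine($"  {output.Key}: [{string.Join(", ", output.Value.Dimensions)}]");
        }
    }
}
```

The printed input and output names are the strings to use in `inputColumnNames`, `outputColumnNames` and the `shapeDictionary` keys, and the printed dimensions are the shapes to put next to them.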

So you'll need to change your code accordingly, from:

    shapeDictionary: new Dictionary<string, int[]>()
    {
        { "input_1:0", new[] { 1, 768, 768, 3 } },
        { "Identity:0", new[] { 1, 52, 52, 3, 85 } },
        { "Identity_1:0", new[] { 1, 26, 26, 3, 85 } },
        { "Identity_2:0", new[] { 1, 13, 13, 3, 85 } },
    },
    inputColumnNames: new[]
    {
        "input_1:0"
    },
    outputColumnNames: new[]
    {
        "Identity:0",
        "Identity_1:0",
        "Identity_2:0"
    },

to this:

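    shapeDictionary: new Dictionary<string, int[]>()
    {
        // Verify the input name and shape in Netron as well.
        { "input_1:0", new[] { 1, 768, 768, 3 } },
        // Replace this shape with the one Netron reports for "output":
        // 402, 471 and 540 in the error message are output names, not dimensions.
        { "output", new[] { 1, 402, 471, 540 } },
    },
    inputColumnNames: new[]
    {
        "input_1:0"
    },
    outputColumnNames: new[]
    {
        "output",
    },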
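
On the question of whether the prediction classes need to change: with a single "output" tensor, the three-tensor YOLO v4 post-processing no longer matches. Below is a rough sketch of reading such a tensor, assuming the usual ultralytics YOLOv5 export layout, where the output has shape [1, N, 5 + numClasses] and each row holds center x, center y, width, height, objectness, then one score per class. That layout is an assumption to confirm against your own model, and `Detection`, `YoloV5OutputSketch` and `Parse` are hypothetical names, not types from the repo.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical result type for illustration; not part of the YOLOv4MLNet repo.
public record Detection(float CenterX, float CenterY, float Width, float Height, float Score, int ClassIndex);

public static class YoloV5OutputSketch
{
    // Reads a flattened [1, N, 5 + numClasses] tensor (assumed layout) and keeps
    // the rows whose objectness * best class score reaches the threshold.
    public static List<Detection> Parse(float[] output, int numClasses, float scoreThreshold)
    {
        if (numClasses < 1)
            throw new ArgumentOutOfRangeException(nameof(numClasses));

        int valuesPerRow = 5 + numClasses;
        if (output.Length % valuesPerRow != 0)
            throw new ArgumentException("Output length does not match 5 + numClasses values per row.", nameof(output));

        var results = new List<Detection>();
        for (int offset = 0; offset < output.Length; offset += valuesPerRow)
        {
            float objectness = output[offset + 4];

            // Pick the class with the highest score in this row.
            int bestClass = 0;
            float bestClassScore = output[offset + 5];
            for (int c = 1; c < numClasses; c++)
            {
                float classScore = output[offset + 5 + c];
                if (classScore > bestClassScore)
                {
                    bestClassScore = classScore;
                    bestClass = c;
                }
            }

            float score = objectness * bestClassScore;
            if (score >= scoreThreshold)
            {
                results.Add(new Detection(
                    output[offset], output[offset + 1], output[offset + 2], output[offset + 3],
                    score, bestClass));
            }
        }
        return results;
    }

    public static void Main()
    {
        // Two made-up rows for a one-class ("logo") model: the second row scores too low to be kept.
        float[] fakeOutput =
        {
            384f, 216f, 100f, 50f, 0.9f, 0.8f,
            10f, 10f, 5f, 5f, 0.1f, 0.5f,
        };

        foreach (var detection in Parse(fakeOutput, numClasses: 1, scoreThreshold: 0.25f))
        {
            Console.WriteLine(detection);
        }
    }
}
```

The boxes are in the coordinates of the resized model input, so they still need scaling back to the original picture, and overlapping boxes still need non-maximum suppression before drawing.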