Issue in reproducing inference results

saba155 commented 8 months ago

After fine-tuning a YOLO v8 model with Ultralytics in Python, I exported the model to ONNX format. However, when I use the exported ONNX model for inference using Ultralytics' Python script, the results differ from the inference results obtained through ML.NET's C# implementation. I have not made any code modifications. Can you suggest if there are differences in image pre-processing or post-processing between the Ultralytics repository and the ML.NET implementation?

sstainba commented 8 months ago

I can't. I don't know what Ultralytics does in their processing.

From: saba155. Sent: Sunday, January 14, 2024 8:29:19 AM. To: sstainba/Yolov8.Net. Subject: [sstainba/Yolov8.Net] Issue in reproducing inference results (Issue #39)

I exported an ONNX model after fine-tuning with the Ultralytics YOLO v8 implementation in Python. The inference results that the same exported ONNX model produces with the Ultralytics Python script differ from those of the ML.NET C# implementation. Can you please suggest whether the pre- or post-processing of images in this repository differs from the Ultralytics implementation? I haven't made any changes to the code.


Talla2k commented 6 months ago

The author's current code resizes every image to a 640x640 square. In the original Ultralytics pipeline, the longer side is scaled to 640, while the other side is scaled proportionally and padded to a multiple of 32. This is where all the problems arise. To work with images of any size, you need to set the dynamic=True flag when exporting to ONNX. Some minor code changes are also needed.
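To make the sizing rule above concrete, here is a minimal Python sketch of the arithmetic, assuming a target size of 640 and a stride of 32. `letterbox_shape` is a hypothetical helper name used only for illustration, not an Ultralytics function. It reflects my reading of the minimum-rectangle letterbox behaviour and may differ from a given Ultralytics version.

```python
def letterbox_shape(height, width, target=640, stride=32):
    """Return (padded_h, padded_w, pad_h, pad_w) for minimum-rectangle letterboxing.

    Hypothetical helper: assumes the image is scaled so it fits inside a
    target x target box, then padded only up to the next multiple of `stride`.
    """
    # Scale ratio (new / old): the smaller ratio keeps the whole image inside the box.
    r = min(target / height, target / width)

    # Size after resizing, before any padding.
    new_h = round(height * r)
    new_w = round(width * r)

    # Total padding per axis, reduced modulo the stride so that the padded
    # side is a multiple of `stride` instead of the full target size.
    pad_h = (target - new_h) % stride
    pad_w = (target - new_w) % stride

    return new_h + pad_h, new_w + pad_w, pad_h, pad_w


if __name__ == "__main__":
    # A 1280x720 (width x height) frame.
    print(letterbox_shape(720, 1280))
```

For a 1280x720 frame this yields a 640x384 input rather than 640x640, which is why a model exported with a fixed square input shape cannot consume it.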

saba155 commented 5 months ago

Thanks for your reply. Yes, @Talla2k, the main difference between the PyTorch implementation and the author's implementation is indeed how input images are resized. Ultralytics' PyTorch implementation uses the letterbox technique, which resizes the image while preserving its aspect ratio (no stretching) and pads the remainder. After applying letterbox resizing and padding, the inference results I got are quite similar. Here is sample code showing how I performed letterboxing with SixLabors.ImageSharp, in case anyone needs it in the future:

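using System;
using SixLabors.ImageSharp;
using SixLabors.ImageSharp.PixelFormats;
using SixLabors.ImageSharp.Processing; // Mutate, Resize, DrawImage, KnownResamplers

// `letterbox` is not defined in this snippet. It is assumed to be a settings object with
// NewShape (height, width), Scaleup, Auto, ScaleFill, Center and Stride members, mirroring
// the arguments of the Ultralytics LetterBox transform.
// Returns a new padded image; the input image is resized in place.
private Image<Rgb24> ApplyLetterbox(Image<Rgb24> image)
{
    var shape = (image.Height, image.Width); // current shape as (height, width)
    var newShape = letterbox.NewShape;       // target shape as (height, width)

    // Scale ratio (new / old): the smaller ratio keeps the whole image inside the target.
    var r = Math.Min(newShape.Item1 / (double)shape.Item1, newShape.Item2 / (double)shape.Item2);

    // Only scale down, never up.
    if (!letterbox.Scaleup)
    {
        r = Math.Min(r, 1.0);
    }

    // (width ratio, height ratio); not used below, but needed later to map boxes back.
    var ratio = (r, r);
    var newUnpad = ((int)Math.Round(shape.Item2 * r), (int)Math.Round(shape.Item1 * r)); // (width, height)

    // Total padding per axis. Kept as double so that halving an odd value is not truncated.
    double dw = newShape.Item2 - newUnpad.Item1;
    double dh = newShape.Item1 - newUnpad.Item2;

    if (letterbox.Auto)
    {
        // Minimum rectangle: pad only up to the next multiple of the stride.
        dw %= letterbox.Stride;
        dh %= letterbox.Stride;
    }
    else if (letterbox.ScaleFill)
    {
        // Stretch to the target shape without padding.
        dw = 0;
        dh = 0;
        newUnpad = (newShape.Item2, newShape.Item1);
        ratio = (newShape.Item2 / (double)shape.Item2, newShape.Item1 / (double)shape.Item1);
    }

    if (letterbox.Center)
    {
        // Split the padding between both sides.
        dw /= 2;
        dh /= 2;
    }

    // shape is (height, width) but newUnpad is (width, height), so swap before comparing.
    if ((shape.Item2, shape.Item1) != newUnpad)
    {
        image.Mutate(x => x.Resize(new ResizeOptions
        {
            Size = new Size(newUnpad.Item1, newUnpad.Item2),
            Mode = ResizeMode.Stretch,           // the size is already aspect-correct
            Sampler = KnownResamplers.Triangle   // bilinear; may not match OpenCV pixel for pixel
        }));
    }

    // The -0.1 / +0.1 offsets put the extra pixel of an odd padding on the bottom/right.
    // Without centering, all padding goes to the bottom and right.
    var top = letterbox.Center ? (int)Math.Round(dh - 0.1) : 0;
    var bottom = (int)Math.Round(dh + 0.1);
    var left = letterbox.Center ? (int)Math.Round(dw - 0.1) : 0;
    var right = (int)Math.Round(dw + 0.1);

    // Draw the resized image onto a gray canvas at the computed offset.
    var borderColor = new Rgb24(114, 114, 114);
    var canvas = new Image<Rgb24>(image.Width + left + right, image.Height + top + bottom, borderColor);
    canvas.Mutate(x => x.DrawImage(image, new Point(left, top), 1f));

    return canvas;
}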
