ML.NET object detection Predict() takes 5 seconds to process

MattiaDurli commented 3 years ago

System Information (please complete the following information):

OS & Version: Windows 10
ML.NET Version: ML.NET v1.6
.NET Version: .NET 5.0

Describe the bug: I've created an ONNX model for Object Detection with Visual Studio and ML Model Builder (using an Azure workspace), using VOTT to define the 4 objects I want to detect. I'm testing the model as explained in the tutorial, and it works well: it detects the 4 objects and the result is OK:

    var sampleData = new MLModel1.ModelInput()
    {
        ImageSource = @"C:\Data\sample1.jpg",
    };

    //Load model and predict output
    var result = MLModel1.Predict(sampleData);

The problem is that it takes 5 seconds (10 seconds on the first run, 5 on the following ones). The sample image is 700x400 pixels, 85 KB, and the computer has an Intel i7 at 2.9 GHz.

Am I doing something wrong, or is this the speed I should expect? Here's the image; the objects to detect are the REF, LOT, the hourglass icon and the factory icon.

[Image: OCR_b]

To Reproduce: Followed the Object Detection sample in the documentation.

Expected behavior: Much faster processing time.

michaelgsharp commented 3 years ago

@JakeRadMSFT are these the times you normally see from Model Builder for this kind of stuff?

@MattiaDurli Can you share the pipeline code and model you are using?

MattiaDurli commented 3 years ago

Yes, this is the pipeline created by model builder:

    IEstimator<ITransformer> pipeline = mlContext.Transforms.LoadImages(@"input", @"ImageSource")
        .Append(mlContext.Transforms.ResizeImages(imageWidth: 800, imageHeight: 600, outputColumnName: @"input", inputColumnName: @"input", cropAnchor: ImageResizingEstimator.Anchor.Center, resizing: ImageResizingEstimator.ResizingKind.Fill))
        .Append(mlContext.Transforms.ExtractPixels(outputColumnName: @"input", inputColumnName: @"input", colorsToExtract: ImagePixelExtractingEstimator.ColorBits.Rgb, orderOfExtraction: ImagePixelExtractingEstimator.ColorsOrder.ARGB))
        .Append(mlContext.Transforms.ApplyOnnxModel(modelFile: @"C:\Data\Code\Projects\CTServerCV\CTServerCV\LabelDetection\MLModel1.onnx"));

Here's the model created: https://www.dropbox.com/s/370drmd1yuyg5kv/MLModel1.zip?dl=0

I really haven't done anything special, just followed the defaults for Object Detection with AutoML.

I have now edited the consumption sample created by AutoML to measure only the Predict function (my previous benchmark also included the MLContext creation), and the time dropped from 5s to 2.5s, but that still seems like a lot to me.

lovettchris commented 3 years ago

You might try separating the loading of the engine from the prediction time, like this:

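    // Create the engine once, outside the timed region.
    var engine = ConsumeModel.CreatePredictionEngine();
    var stopwatch = System.Diagnostics.Stopwatch.StartNew();
    ModelOutput result = engine.Predict(input);
    stopwatch.Stop();
    Console.WriteLine($"Predict: {stopwatch.ElapsedMilliseconds} ms");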

Or run the prediction multiple times, discount the first prediction, and see what the average is after that.
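A minimal sketch of the "run it several times and ignore the first call" idea might look like the following. The `TimeRuns` helper and the sleep-based stand-in workload are hypothetical names introduced only for illustration; in the real project the timed delegate would presumably be `() => engine.Predict(input)`, with the engine created once beforehand.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

// Hypothetical helper: calls `action` `warmup` times untimed, then `runs` times timed.
// Returns the elapsed milliseconds of each timed call.
static double[] TimeRuns(Action action, int warmup, int runs)
{
    for (int i = 0; i < warmup; i++)
        action(); // untimed: absorbs one-time costs such as model loading and JIT

    var timings = new double[runs];
    var stopwatch = new Stopwatch();
    for (int i = 0; i < runs; i++)
    {
        stopwatch.Restart();
        action();
        stopwatch.Stop();
        timings[i] = stopwatch.Elapsed.TotalMilliseconds;
    }
    return timings;
}

// Stand-in workload (an assumption, not the real model call).
// In the actual project, replace the body with: engine.Predict(input)
int calls = 0;
Action predict = () => { calls++; Thread.Sleep(5); };

double[] ms = TimeRuns(predict, warmup: 1, runs: 10);
Console.WriteLine($"avg {ms.Average():F1} ms, min {ms.Min():F1} ms, max {ms.Max():F1} ms");
```

Reporting the minimum and maximum alongside the average makes it easier to see whether a single slow call is skewing the result.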

MattiaDurli commented 3 years ago

Yes, I tried that, and the time dropped from 5s to 2.5s, but that still seems like a lot to me. I'm trying to locate only 4 different objects, and I've seen demos with other frameworks where object recognition runs multiple times per second, which is why I suspect something's wrong.

michaelgsharp commented 3 years ago

Hey @MattiaDurli, can you upload your model and the ONNX model, along with one sample image and the code to run it? I have an idea I want to play around with, if you are still looking into this.

dotnet/machinelearning issue #5881