System Information (please complete the following information):
OS & Version: Windows 11
ML.NET Version: ML.NET 1.7.0 and 3.0
.NET Version: .NET 8
Describe the bug
I trained my model last year with ML Model Builder and ML.NET 1.7, using a set of images annotated with VoTT. Training ran in Azure because local training wasn't available at the time. I then generated the sample console application that loads my ONNX-based MLModel1.zip.
With a reference image, after the first load, it takes 0.58 s on my current machine to detect the objects. My computer has a GPU, but I assume it isn't used at this stage.
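For reference, scoring with the older ONNX-based model looks roughly like the sketch below. The `ModelInput`/`ModelOutput` classes, the `ImageSource` property, and the file path are assumptions based on the code Model Builder typically generates, not the exact generated code:

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

// Assumed input/output shapes (Model Builder generates similar classes).
public class ModelInput
{
    public string ImageSource { get; set; }
}

public class ModelOutput
{
    public string[] PredictedLabel { get; set; }
    public float[] Score { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext();

        // Load the saved pipeline produced by Model Builder.
        ITransformer model = mlContext.Model.Load("MLModel1.zip", out _);
        var engine = mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(model);

        // The first call pays the warm-up cost; time a later call instead.
        engine.Predict(new ModelInput { ImageSource = "referencetest.jpg" });

        var sw = System.Diagnostics.Stopwatch.StartNew();
        engine.Predict(new ModelInput { ImageSource = "referencetest.jpg" });
        sw.Stop();
        Console.WriteLine($"Detection took {sw.Elapsed.TotalSeconds:F2}s");
    }
}
```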
I have now retrained the model, this time locally with my GPU, using the same image set and annotations. This produces an MLModel1.mlnet file, and the sample console app now references the TorchSharp NuGet packages instead of the ONNX ones.
If I test (after the first load) with

```csharp
mlContext.GpuDeviceId = null;
mlContext.FallbackToCpu = true;
```

it now takes 2.1 s.
If I test (after the first load) with

```csharp
mlContext.GpuDeviceId = 0;
mlContext.FallbackToCpu = false;
```

it now takes 0.68 s.
The older ONNX version is faster, even though it doesn't use the GPU at all.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Inference with the retrained model should be at least as fast as the older ONNX model, especially when the GPU is used.
Screenshots, Code, Sample Projects
The image I use as reference:

![referencetest](https://github.com/dotnet/machinelearning/assets/6052847/0df786ca-95dc-45e4-be05-812f59778265)