dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

Image classification not working for ONNX model #6886

Open gktval opened 10 months ago

gktval commented 10 months ago

Describe the bug

I am loading a model that was trained in TensorFlow and then converted to ONNX. When I make predictions, every image is assigned to the first class, even though it should not be. The prediction values also vary only slightly between images. Here are two example outputs:

[1.5047534, -0.16131932, 0.71119475]
[1.5056, -0.16168906, 0.7110559]

It seems odd that there are negative values, so I don't think the output is correct in that respect. I thought it might be because the model was trained on RGBA images, while the MLContext transform uses ARGB/ABGR/etc. I tried switching the channel order of the input images, but this did not help. Any ideas about what I am doing wrong in the code below?

The model I trained is almost identical to the classification model in this tutorial, except that I am using 4 bands instead of 3: https://www.tensorflow.org/tutorials/images/classification
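One possible reading of the negative values: the model in the linked tutorial ends in a Dense layer with no activation and is trained with from_logits=True, so if this model follows it, dense_1 would hold raw logits rather than probabilities. A minimal sketch, assuming that is the case, of converting such an output into probabilities with a softmax. LogitHelper and Softmax are hypothetical names used for illustration and are not part of ML.NET:

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of ML.NET.
public static class LogitHelper
{
    // Numerically stable softmax: subtract the largest logit before
    // exponentiating so that Math.Exp cannot overflow.
    public static float[] Softmax(float[] logits)
    {
        if (logits == null || logits.Length == 0)
            throw new ArgumentException("At least one logit is required.", nameof(logits));

        float max = logits.Max();
        double[] exps = logits.Select(v => Math.Exp(v - max)).ToArray();
        double sum = exps.Sum();

        // Each entry is now in (0, 1] and the entries sum to 1.
        return exps.Select(e => (float)(e / sum)).ToArray();
    }
}
```

Applied to the first output above, LogitHelper.Softmax returns a distribution that sums to 1 and keeps its largest value at index 0. Under this assumption the negative entries alone would not indicate a bug; the near-identical values across different images would remain the real symptom.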

Here is the code for loading the model:

ITransformer GetPredictionPipeline(MLContext mlContext)
{
    var pipeline2 = mlContext
        .Transforms
        // Adjust the image to the required model input size
        .ResizeImages(
            inputColumnName: "sequential_input",
            imageWidth: 224,
            imageHeight: 224,
            outputColumnName: "resized"
        )
        // Extract the pixels from the image as a 1D float array, but keep them in the same order as they appear in the image.
        .Append(mlContext.Transforms.ExtractPixels(
            inputColumnName: "resized",
            colorsToExtract: Microsoft.ML.Transforms.Image.ImagePixelExtractingEstimator.ColorBits.All,
            orderOfExtraction: Microsoft.ML.Transforms.Image.ImagePixelExtractingEstimator.ColorsOrder.ARGB,
            outputColumnName: "sequential_input",
            scaleImage: 1 / 255f)
        )
        // Run the ONNX model on the extracted pixels
        .Append(mlContext.Transforms.ApplyOnnxModel(
            modelFile: "./model.onnx",
            inputColumnName: "sequential_input",
            outputColumnName: "dense_1",
            gpuDeviceId: 0
        )
    );

    // Fit on an empty data view; the pipeline has no trainable parameters
    var emptyDv = mlContext.Data.LoadFromEnumerable(new OnnxInput[] { });

    var model = pipeline2.Fit(emptyDv);
    return model;
}
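The comment on ExtractPixels says the pixels stay in image order, but the ExtractPixels call above does not set interleavePixelColors, which defaults to false. With that default the output is planar (all values of one channel, then all values of the next), whereas a model exported from TensorFlow typically expects the channels of each pixel to sit next to each other. A small sketch in plain C#, with no ML.NET calls, of how the two layouts differ. PixelLayout and PlanarToInterleaved are hypothetical names used only for illustration:

```csharp
using System;

// Hypothetical helper for illustration; not part of ML.NET.
public static class PixelLayout
{
    // Reorders planar data [channel, row, column] into interleaved data
    // [row, column, channel], i.e. CHW to HWC.
    public static float[] PlanarToInterleaved(float[] planar, int height, int width, int channels)
    {
        if (planar.Length != height * width * channels)
            throw new ArgumentException("Length does not match the given dimensions.", nameof(planar));

        var interleaved = new float[planar.Length];
        int planeSize = height * width;

        for (int c = 0; c < channels; c++)
        {
            for (int i = 0; i < planeSize; i++)
            {
                // Pixel i of channel c moves next to the other channels of pixel i.
                interleaved[i * channels + c] = planar[c * planeSize + i];
            }
        }
        return interleaved;
    }
}
```

If the layout turns out to be the cause, passing interleavePixelColors: true to ExtractPixels would be the option to try, rather than reordering by hand. A second thing to check, assuming the model keeps the tutorial's Rescaling(1./255) layer, is whether scaleImage: 1 / 255f scales the input a second time; inputs squeezed into a tiny range could plausibly explain outputs that barely change between images.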

My input and output classes look like this:

public class OnnxInput
{
    public const int ImageWidth = 224;
    public const int ImageHeight = 224;

    [ColumnName("sequential_input")]
    [ImageType(ImageWidth, ImageHeight)]
    public MLImage Image { get; set; }
}

public class OnnxOutput
{
    [ColumnName("dense_1")]
    public float[] Output { get; set; }
}

Finally, the prediction looks like this:

MLContext mlContext = new MLContext();
var onnxPredictionPipeline = GetPredictionPipeline(mlContext);
var onnxPredictionEngine = mlContext.Model.CreatePredictionEngine<OnnxInput, OnnxOutput>(onnxPredictionPipeline);

for (int j = 0; j < files.Length; j++)
{
    string pngFile = files[j];

    // Load the image file and wrap it as the model input
    var mlImage = MLImage.CreateFromFile(pngFile);
    var testInput = new OnnxInput
    {
        Image = mlImage
    };
    var prediction = onnxPredictionEngine.Predict(testInput);
}