dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

Loading Existing ML.NET Image Classification Model and Adapting Predict To Use InMemory Images #5718

Closed jamsoft closed 2 years ago

jamsoft commented 3 years ago

I originally raised this as a question on SO, where @michaelgsharp asked if I could bring the discussion here as an issue to track against the repo.

I've been looking at the various examples for achieving this and struggling to get it working in my situation. The advice used to be to create a completely custom IDataView, according to this answer on GitHub.

But that was before the new attribute denoting image types was introduced, which makes the first example obsolete as far as I can tell. There is a unit test showing the newer approach here.

The issue is that these examples all re-save the MLModel.zip file as part of their setup, setting a new input schema, whereas I just want to adapt the inputs at runtime. They also cover much more complex situations, either adapting TensorFlow models in their pipelines or standardizing images. My datasets are already standardized and in ML.NET formats.

The original input object looks like this:

public class ModelInput
{
    [ColumnName("Label"), LoadColumn(0)]
    public string Label { get; set; }    

    [ColumnName("ImageSource"), LoadColumn(1)]
    public string ImageSource { get; set; }
}

And I would like to be able to pass in an object like this:

public class MlClientModelInput
{
    [ColumnName("Label"), LoadColumn(0)]
    public string Label { get; set; }

    [ColumnName("Image"), LoadColumn(1), ImageType(328, 256)]
    public Bitmap Image { get; set; }
}

I've been trying various things for hours and getting nowhere at all. Even the re-saving option isn't working, as I can't work out how to load the existing model without having to re-process the images. It always complains that the schema is wrong.

So far I've been trying with this code:

public static PredictionEngine<MlClientModelInput, MlClientModelOutput> Create()
{
    // Create new MLContext
    MLContext mlContext = new MLContext();

    var dataProcessPipeline = mlContext.Transforms.ResizeImages("Image", 328, 256);

    //var model = mlContext.Model.Load(MLNetModelPath, out var modelInputSchema);
    //var pipeline = mlContext.Transforms.ConvertToGrayscale("GrayImage", "Image");

    ITransformer model = dataProcessPipeline.Fit(CreateEmptyDataView(mlContext));
    // Load model & create prediction engine
    ITransformer mlModel = mlContext.Model.Load(MLNetModelPath, out var modelInputSchema);

    mlContext.Model.Save(mlModel, null, MLNetModelPathEdited);

    // Create new MLContext
    MLContext mlContext2 = new MLContext();
    ITransformer mlModel2 = mlContext.Model.Load(MLNetModelPathEdited, out var modelInputSchema2);
    var predEngine = mlContext.Model.CreatePredictionEngine<MlClientModelInput, MlClientModelOutput>(mlModel);

    return predEngine;
}

I've now found an example that is almost exactly what I need, but it also loads its model differently, so I'm still unsure how to implement a pipeline without having to standardize the data. It's all very confusing.

using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;
using Microsoft.ML;
using ObjectDetection.Core;

namespace ObjectDetection
{
    public class OnnxModelScorer
    {
        private readonly string imagesLocation;
        private readonly string imagesFolder;
        private readonly string modelLocation;
        private readonly MLContext mlContext;

        private IList<YoloBoundingBox> _boxes = new List<YoloBoundingBox>();
        private readonly YoloWinMlParser _parser = new YoloWinMlParser();

        public OnnxModelScorer(string modelLocation)
        {
            this.modelLocation = modelLocation;

            mlContext = new MLContext();
        }

        public struct ImageNetSettings
        {
            public const int imageHeight = 416;
            public const int imageWidth = 416;
        }

        public struct TinyYoloModelSettings
        {
            // To check the Tiny YOLOv2 model's input and output parameter names,
            // you can use a tool like Netron,
            // which is installed by Visual Studio AI Tools.

            // input tensor name
            public const string ModelInput = "image";

            // output tensor name
            public const string ModelOutput = "grid";
        }

        public void Score(Bitmap image)
        {
            var imageData = new BitmapDataView(image);
            var model = LoadModel(imageData);

            PredictDataUsingModel(imageData, model);
        }

        private PredictionEngine<BitmapDataView, ImageNetPrediction> LoadModel(BitmapDataView imageData)
        {
            Console.WriteLine("Read model");
            Console.WriteLine($"Model location: {modelLocation}");
            Console.WriteLine($"Images folder: {imagesFolder}");
            Console.WriteLine($"Default parameters: image size=({ImageNetSettings.imageWidth},{ImageNetSettings.imageHeight})");

            //var data = mlContext.Data.LoadFromTextFile<ImageNetData>(imagesLocation, hasHeader: true);

            var pipeline = mlContext.Transforms.ResizeImages(outputColumnName: "image", imageWidth: ImageNetSettings.imageWidth, imageHeight: ImageNetSettings.imageHeight, inputColumnName: "image")
                            .Append(mlContext.Transforms.ExtractPixels(outputColumnName: "image"))
                            .Append(mlContext.Transforms.ApplyOnnxModel(modelFile: modelLocation, outputColumnNames: new[] { TinyYoloModelSettings.ModelOutput }, inputColumnNames: new[] { TinyYoloModelSettings.ModelInput }));

            var model = pipeline.Fit(imageData);

            var predictionEngine = mlContext.Model.CreatePredictionEngine<BitmapDataView, ImageNetPrediction>(model);

            return predictionEngine;
        }

        protected void PredictDataUsingModel(BitmapDataView data, PredictionEngine<BitmapDataView, ImageNetPrediction> model)
        {
            Console.WriteLine($"Tags file location: {imagesLocation}");
            Console.WriteLine("");
            Console.WriteLine("=====Identify the objects in the images=====");
            Console.WriteLine("");

            var probs = model.Predict(data).PredictedLabels;
            IList<YoloBoundingBox> boundingBoxes = _parser.ParseOutputs(probs);
            var filteredBoxes = _parser.NonMaxSuppress(boundingBoxes, 5, .5F);

            //Console.WriteLine(".....The objects in the image {0} are detected as below....", sample.Label);
            foreach (var box in filteredBoxes)
            {
                Console.WriteLine(box.Label + " and its Confidence score: " + box.Confidence);
            }
            Console.WriteLine("");

            //var testData = ImageNetData.ReadFromCsv(imagesLocation, imagesFolder);

            //foreach (var sample in testData)
            //{
            //    var probs = model.Predict(sample).PredictedLabels;
            //    IList<YoloBoundingBox> boundingBoxes = _parser.ParseOutputs(probs);
            //    var filteredBoxes = _parser.NonMaxSuppress(boundingBoxes, 5, .5F);

            //    Console.WriteLine(".....The objects in the image {0} are detected as below....", sample.Label);
            //    foreach (var box in filteredBoxes)
            //    {
            //        Console.WriteLine(box.Label + " and its Confidence score: " + box.Confidence);
            //    }
            //    Console.WriteLine("");
            //}
        }
    }
}

What I'm trying to get to is being able to call the prediction engine like this:

public static MlClientModelOutput Score(Bitmap image)
{
    var imageData = new BitmapDataView(image);
    var model = LoadModel(imageData);

    return model.Predict(imageData);
}

What's so confusing is that in every example I can find doing what I'm trying to do, they never seem to load the actual model I have in .zip format. Even in this full example nothing is ever loaded from a generated ML.NET model zip file. If I copied this pattern in my code, it would blow up, because I would never have executed this line:

ITransformer mlModel = mlContext.Model.Load(MLNetModelPath, out var modelInputSchema);
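For reference, when the schema "complains", the saved model's expected input columns can be inspected directly. This is only a sketch: it assumes `MLNetModelPath` points at the generated zip, as elsewhere in this thread.

```csharp
using System;
using Microsoft.ML;

// Sketch: load the generated zip and print the input columns the
// saved model actually expects, to compare against your input class.
var mlContext = new MLContext();
ITransformer model = mlContext.Model.Load(MLNetModelPath, out var inputSchema);

foreach (var column in inputSchema)
{
    // Prints each expected column's name and type, e.g. "ImageSource: String".
    Console.WriteLine($"{column.Name}: {column.Type}");
}
```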

michaelgsharp commented 3 years ago

Thanks for reaching out. I saw your question on Stack Overflow about this. What you are trying to do is possible, and you don't need a custom IDataView. This example goes over it; have you looked at it before?

I think the biggest thing to note is that the pipeline is split into two steps. The first step does all the image manipulation needed to load an image from disk and transform it as needed. The second step takes an in-memory image and runs the rest of the pipeline with it. This way you can train using images on disk and then run predictions using in-memory images. If you can send a sample image and your code/model, we can help with details.
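That split could be sketched roughly like this. The column names, folder path, and image dimensions below are hypothetical (the 328x256 size is borrowed from earlier in this thread), and the trainer is elided; it is an illustration of the two-step idea, not the linked example's exact code.

```csharp
using System.Drawing;
using Microsoft.ML;
using Microsoft.ML.Transforms.Image;

// Input class for prediction time: the image is already in memory.
public class InMemoryInput
{
    [ImageType(328, 256)]   // sizes assumed, matching the thread
    public Bitmap Image { get; set; }
}

public static class TwoStepPipelineSketch
{
    public static void Build(MLContext mlContext, IDataView trainingData)
    {
        // Step 1 (training-time only): resolve a file path column
        // ("ImageSource") into an in-memory image column ("Image").
        var loadStep = mlContext.Transforms.LoadImages(
            outputColumnName: "Image",
            imageFolder: "images",          // hypothetical folder
            inputColumnName: "ImageSource");

        // Step 2 (shared): everything downstream only sees the
        // in-memory "Image" column, so the same transforms work when
        // a Bitmap is supplied directly at prediction time.
        var processStep = mlContext.Transforms.ResizeImages("Image", 328, 256)
            .Append(mlContext.Transforms.ExtractPixels("Image"));
        // ... append the trainer here and call Fit() as usual.
    }
}
```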

The reason many of the examples don't load the model is that they train it during the example, so they don't need to load it. Sometimes an example will save the model just to show how to do that, but oftentimes it won't reload it since it's already in memory.

jamsoft commented 3 years ago

No problem, thanks for picking it up. I'll look over the link you provided in the next few days and see how I get on. I'll report back ASAP. Thanks.

jamsoft commented 3 years ago

FINALLY! I just found some time to look over this. It does indeed look much more promising than any of the other examples.

I'm going to take a closer look at this and work this into my code.

I think one of the issues I was hitting was being a bit confused by all the boilerplate code spat out by the ML GUI tool, which doesn't cover my use case, and then trying to adapt that code to introduce this feature.

I think I'm going to remove the GUI tool from my workflow and use this example code for the entire process. The test / train split feature is seriously nice.

NarwhalRoger commented 2 years ago

Hi, I am facing a similar issue (having no idea how to implement in-memory image prediction, etc.) and I have looked at the code suggested by @michaelgsharp above. Maybe I got something wrong, but those "InMemoryImageData" objects are not actually in memory. They are created with a function called "LoadInMemoryImagesFromDirectory", which doesn't solve the issue, because these so-called "in memory" images are loaded from image files on disk. Is that the only way to implement this?

I have searched a lot about in-memory image use in ML.NET, and while plenty of people say it is possible, I have not found any working code that runs predictions on truly in-memory images with a Microsoft ML model.

If you know how to implement this or have working code, please share it. This machine learning tool has too few examples and explanations.

michaelgsharp commented 2 years ago

@NarwhalRoger you should still be able to use that example; just start from the step after we have loaded the images. As long as your images are stored as a byte array, they are already in memory. The image data has to come from somewhere, and we can't use something like a camera for our public examples since not everyone would be able to use one, so we mostly store them on disk. We are working on getting more examples for this.
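A minimal sketch of that byte-array idea, using the `Bitmap`-based input class style from earlier in this thread (the class name, helper name, and 328x256 size are hypothetical; the bytes could come from a network request, a camera, or a database just as well as a file):

```csharp
using System.Drawing;
using System.IO;
using Microsoft.ML.Transforms.Image;

public class InMemoryImageInput
{
    [ImageType(328, 256)]   // sizes assumed, matching the thread
    public Bitmap Image { get; set; }
}

public static class InMemoryImageHelper
{
    // Turns raw image bytes, from any source, into a prediction input.
    public static InMemoryImageInput FromBytes(byte[] imageBytes)
    {
        // Note: GDI+ requires the stream to remain alive for the
        // Bitmap's lifetime, so it is intentionally not disposed here.
        var stream = new MemoryStream(imageBytes);
        return new InMemoryImageInput { Image = new Bitmap(stream) };
    }
}
```

The resulting object would then be passed to `predictionEngine.Predict(...)` exactly as if it had been loaded from disk.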

If you have more questions please open a new issue and share your example pipeline so we can take a look. I'm going to close this issue since the original one was resolved and I should have closed it already.