dotnet / machinelearning-modelbuilder

Simple UI tool to build custom machine learning models.
Creative Commons Attribution 4.0 International

Use Bitmaps & ImageType attribute in codegen for image scenarios #967

Closed luisquintanilla closed 2 years ago

luisquintanilla commented 4 years ago

Is your feature request related to a problem? Please describe.

In scenarios involving images, it's common for image files to be used during training while image bytes are used during scoring. This workflow is challenging because the generated pipeline isn't flexible enough to handle both. When files are used for training, users have to write files out at scoring time as well, which adds time to inferencing operations, takes up space on the user's PC, and forces the user to handle proper cleanup.
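For context, the temp-file workflow that the string-based ModelInput forces today looks roughly like this (a sketch with assumed names: imageBytes stands in for the caller's in-memory image, and ConsumeModel.Predict stands in for the generated prediction helper):

```csharp
using System.IO;

// The caller already has the image in memory...
byte[] imageBytes = GetImageBytesSomehow(); // hypothetical source (HTTP upload, database, etc.)

// ...but must round-trip it through the file system just to score it.
string tempPath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName() + ".jpg");
File.WriteAllBytes(tempPath, imageBytes);
try
{
    var prediction = ConsumeModel.Predict(new ModelInput { ImageSource = tempPath });
}
finally
{
    File.Delete(tempPath); // cleanup is the caller's problem
}
```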

Describe the solution you'd like

A solution to the problem would be to have the pipeline take in Bitmaps as input, decorated with the ImageType attribute.

Instead of using the image path (string):

    public class ModelInput
    {
        [ColumnName("Label"), LoadColumn(0)]
        public string Label { get; set; }

        [ColumnName("ImageSource"), LoadColumn(1)]
        public string ImageSource { get; set; }
    }

change it to a Bitmap:

    public class ModelInput
    {
        [ColumnName("Label"), LoadColumn(0)]
        public string Label { get; set; }

        [ColumnName("ImageSource"), LoadColumn(1), ImageType(224,224)]
        public Bitmap ImageSource { get; set; }
    }

Bitmaps make it possible to work with files as well as bytes or streams via the Image.FromFile and Image.FromStream methods. Regardless of how users provide their images to the model for inferencing, the pipeline handles both cases.
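A sketch of what both input paths could look like with a Bitmap-based ModelInput (the file names and the byte source here are placeholder assumptions):

```csharp
using System.Drawing;
using System.IO;

// From a file on disk:
var fromFile = new ModelInput
{
    ImageSource = (Bitmap)Image.FromFile("photo.jpg") // placeholder path
};

// From bytes already in memory -- no temp file needed.
// Note: GDI+ requires the stream to stay open for the lifetime of the image.
byte[] imageBytes = File.ReadAllBytes("photo.jpg"); // stand-in for bytes from any source
var stream = new MemoryStream(imageBytes);
var fromBytes = new ModelInput
{
    ImageSource = (Bitmap)Image.FromStream(stream)
};
```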

Any additional operations such as resizing or pixel extraction already operate on Bitmaps. The one change needed in the pipeline is that, when using Bitmap, the LoadImages transform is no longer required, since the Bitmap is provided directly:

            var pipeline = mlContext.Transforms.ResizeImages("ImageSource_featurized", 224, 224, "ImageSource")
                .Append(mlContext.Transforms.ExtractPixels("ImageSource_featurized", "ImageSource_featurized"))
                .Append(mlContext.Transforms.CustomMapping<NormalizeInput, NormalizeOutput>(
                    (input, output) => NormalizeMapping.Mapping(input, output),
                    contractName: nameof(NormalizeMapping)))
                .Append(mlContext.Transforms.ApplyOnnxModel(modelFile: ONNX_MODEL))
                .Append(mlContext.Transforms.CustomMapping<LabelMappingInput, LabelMappingOutput>(
                    (input, output) => LabelMapping.Mapping(input, output),
                    contractName: nameof(LabelMapping)));
LittleLittleCloud commented 4 years ago

@JakeRadMSFT @briacht thought?

JakeRadMSFT commented 4 years ago

Yes! We just need to make sure we have the training pipeline too.

maartendweerdt commented 4 years ago

Any idea when this will be added? Is the only workaround currently not using Modelbuilder, or is there some hacky way to change the modelbuilder generated code at this stage?

centrolutions commented 4 years ago

I agree with this change. Having to store a file in the file system, run the model against it, and then cleanup the file is causing me all kinds of problems. Being able to load an image from bytes or something similar would be great and would be much more flexible.

Does anyone have a workaround until the team can (hopefully) make this change?

luisquintanilla commented 4 years ago

@centrolutions @maartendweerdt Thanks for your interest in this change. I'll see if I can come up with a sample. Currently the pipelines for cloud training on Azure and local training are slightly different so they might require slightly different solutions. I'll post on here when I have a sample workaround setup

centrolutions commented 4 years ago

For anyone who stumbles on this trying to feed a Bitmap into an object detection model and predict on it, here's what seems to be working for me:

Run your data through the model builder extension per usual. The extension will generate some sample code. The sample code includes a ModelBuilder.cs file in the .ConsoleApp project and a ModelInput.cs class in the .Model project.

Make the following change to the ModelInput class:

//[ColumnName("ImageSource"), LoadColumn(1)]
//public string ImageSource { get; set; }
[ColumnName("ImageSource"), LoadColumn(1), ImageType(800, 600)]
public Bitmap ImageSource { get; set; }

Make the following three changes to the ModelBuilder class:

  1. Add this function:

    private static IEnumerable<ModelInput> GetModelInputs()
    {
        var lines = File.ReadAllLines(TRAIN_DATA_FILEPATH);
        for (var i = 1; i < lines.Length; i++)
        {
            var columns = lines[i].Split(','); //naive csv parsing
            yield return new ModelInput()
            {
                Label = columns[0],
                ImageSource = (Bitmap)Bitmap.FromFile(columns[1]), //ouch -- lots of in-memory bitmaps
            };
        }
    }
  2. Update the CreateMLNetModelFromOnnx method:

            //IDataView inputDataView = mlContext.Data.LoadFromTextFile<ModelInput>(
            //                                path: TRAIN_DATA_FILEPATH,
            //                                hasHeader: true,
            //                                separatorChar: ',',
            //                                allowQuoting: true,
            //                                allowSparse: false);
            IDataView inputDataView = mlContext.Data.LoadFromEnumerable<ModelInput>(GetModelInputs());
  3. Lastly, update the BuildPipeline function:

    //var pipeline = mlContext.Transforms.LoadImages("ImageSource_featurized", null, "ImageSource")
    //                          .Append(mlContext.Transforms.ResizeImages(outputColumnName: "input", imageWidth: 800, imageHeight: 600, inputColumnName: "input"))
    //                          .Append(mlContext.Transforms.ExtractPixels("input", "ImageSource_featurized"))
    //                          .Append(mlContext.Transforms.ApplyOnnxModel(modelFile: ONNX_MODEL));
    var pipeline = mlContext.Transforms.ResizeImages("ImageSource_featurized", 800, 600, "ImageSource")
                .Append(mlContext.Transforms.ExtractPixels("input", "ImageSource_featurized"))
                .Append(mlContext.Transforms.ApplyOnnxModel(modelFile: ONNX_MODEL));

Finally, change the Program -> Main function in the .ConsoleApp project to call the ModelBuilder.CreateMLNetModelFromOnnx() method to generate a new MLModel.zip file.

After all that, replace the existing MLModel.zip file in the root of the .Model project, then switch the Program -> Main method code back to its original state, but with this change:

            Bitmap bmp = (Bitmap)Bitmap.FromFile(@"D:\Pictures\test.jpg");
            ModelInput sampleData = new ModelInput()
            {
                //ImageSource = @"D:\Pictures\test.jpg",
                ImageSource = bmp,
            };

Run the program and ensure your prediction results are the same as before.

Note: you should only use a limited (small) dataset to create the ML model since this code loads the images into memory. I'm sure there's a better way to do this using lazy loading or something like that, but this is my "brute force" workaround. Standard disclaimers apply -- test this for yourself, works on my machine, do not use for production, etc, etc.
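For the consumption side, the same idea means scoring straight from in-memory bytes. A minimal sketch (the model path, ModelOutput type, and engine setup are assumptions based on the usual generated code, not taken from this thread):

```csharp
using System.Drawing;
using System.IO;
using Microsoft.ML;

var mlContext = new MLContext();
ITransformer model = mlContext.Model.Load("MLModel.zip", out var inputSchema); // placeholder path
var engine = mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(model);

// Bytes could come from a network request or database -- no file round-trip.
byte[] imageBytes = File.ReadAllBytes(@"D:\Pictures\test.jpg");
var stream = new MemoryStream(imageBytes); // keep open while the Bitmap is in use
var prediction = engine.Predict(new ModelInput
{
    ImageSource = (Bitmap)Image.FromStream(stream)
});
```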

JakeRadMSFT commented 3 years ago

Investigate if we can generate code that supports Path and Bitmap

LittleLittleCloud commented 3 years ago

Thanks @luisquintanilla, @maartendweerdt and @centrolutions for your suggestions. The new ModelInput class to support loading images from memory will be as follows (borrowed from @luisquintanilla's example, except that I removed the LoadColumn attribute):

    public class ModelInput
    {
        [ColumnName("Label")]
        public string Label { get; set; }

        [ColumnName("ImageSource"), ImageType(224,224)]
        public Bitmap ImageSource { get; set; }
    }

And for local image classification, which uses LoadRawImageBytes to load images from a file path:

    public class ModelInput
    {
        [ColumnName("Label")]
        public string Label { get; set; }

        [ColumnName("ImageSource")]
        public float[] ImageSource { get; set; }
    }
sarah-graf commented 3 years ago

Hi @LittleLittleCloud, I'm trying to set bitmaps as input for my training pipeline. However, after implementing @centrolutions' approach, I still get a schema mismatch error (schema mismatch for input column 'ImageSource': expected String, got Image<800, 600>) when the TrainModel method in ModelBuilder.cs is called. From this I conclude that I still have to adjust the ImageSource somewhere, but I don't know where; do you have any information about it? Thanks in advance!

LittleLittleCloud commented 3 years ago

@sarah-graf

What does your training pipeline look like? It would be great if you could paste your ModelBuilder.cs here.

sarah-graf commented 3 years ago

@LittleLittleCloud thanks for your reply! Here is my ModelBuilder.cs with the training pipeline I am using. The error always occurs when the TrainModel method is called.


using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using ModelCreatorMLNetML.Model;
using Microsoft.ML.Vision;
using System.Drawing;

namespace ModelCreatorMLNetML.ConsoleApp
{
    public static class ModelBuilder
    {
        private static string TRAIN_DATA_FILEPATH = @"C:\Users\XX\AppData\Local\Temp\d4b30f27-6dcb-4452-85b4-f36028ff0e9d.tsv";
        private static string MODEL_FILEPATH = @"C:\Users\XX\AppData\Local\Temp\MLVSTools\ModelCreatorMLNetML\ModelCreatorMLNetML.Model\MLModel.zip";
        private static MLContext mlContext = new MLContext(seed: 1);

        public static void CreateModel()
        {
            // Load Data
            IDataView trainingDataView = mlContext.Data.LoadFromEnumerable<ModelInput>(GetModelInputs());

            // Build training pipeline
            IEstimator<ITransformer> trainingPipeline = BuildTrainingPipeline(mlContext);

            // Train Model
            ITransformer mlModel = TrainModel(mlContext, trainingDataView, trainingPipeline);

            // Evaluate quality of Model
            Evaluate(mlContext, trainingDataView, trainingPipeline);

            // Save model
            SaveModel(mlContext, mlModel, MODEL_FILEPATH, trainingDataView.Schema);
        }

        public static IEstimator<ITransformer> BuildTrainingPipeline(MLContext mlContext)
        {
            // Data process configuration with pipeline data transformations 
            var dataProcessPipeline = mlContext.Transforms.Conversion.MapValueToKey("Label", "Label")
                                      .Append(mlContext.Transforms.LoadRawImageBytes("ImageSource_featurized", null, "ImageSource"))
                                      .Append(mlContext.Transforms.CopyColumns("Features", "ImageSource_featurized"));

            // Set the training algorithm 
            var trainer = mlContext.MulticlassClassification.Trainers.ImageClassification(new ImageClassificationTrainer.Options() { LabelColumnName = "Label", FeatureColumnName = "Features" })
                                      .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel", "PredictedLabel"));

            var trainingPipeline = dataProcessPipeline.Append(trainer);

            return trainingPipeline;
        }

        public static ITransformer TrainModel(MLContext mlContext, IDataView trainingDataView, IEstimator<ITransformer> trainingPipeline)
        {
            Console.WriteLine("=============== Training  model ===============");

            ITransformer model = trainingPipeline.Fit(trainingDataView);

            Console.WriteLine("=============== End of training process ===============");
            return model;
        }

        private static void Evaluate(MLContext mlContext, IDataView trainingDataView, IEstimator<ITransformer> trainingPipeline)
        {
            // Cross-Validate with single dataset (since we don't have two datasets, one for training and for evaluate)
            // in order to evaluate and get the model's accuracy metrics
            Console.WriteLine("=============== Cross-validating to get model's accuracy metrics ===============");
            var crossValidationResults = mlContext.MulticlassClassification.CrossValidate(trainingDataView, trainingPipeline, numberOfFolds: 5, labelColumnName: "Label");
            PrintMulticlassClassificationFoldsAverageMetrics(crossValidationResults);
        }

        private static void SaveModel(MLContext mlContext, ITransformer mlModel, string modelRelativePath, DataViewSchema modelInputSchema)
        {
            // Save/persist the trained model to a .ZIP file
            Console.WriteLine($"=============== Saving the model  ===============");
            mlContext.Model.Save(mlModel, modelInputSchema, GetAbsolutePath(modelRelativePath));
            Console.WriteLine("The model is saved to {0}", GetAbsolutePath(modelRelativePath));
        }

        public static string GetAbsolutePath(string relativePath)
        {
            FileInfo _dataRoot = new FileInfo(typeof(Program).Assembly.Location);
            string assemblyFolderPath = _dataRoot.Directory.FullName;

            string fullPath = Path.Combine(assemblyFolderPath, relativePath);

            return fullPath;
        }

        public static void PrintMulticlassClassificationMetrics(MulticlassClassificationMetrics metrics)
        {
            Console.WriteLine($"************************************************************");
            Console.WriteLine($"*    Metrics for multi-class classification model   ");
            Console.WriteLine($"*-----------------------------------------------------------");
            Console.WriteLine($"    MacroAccuracy = {metrics.MacroAccuracy:0.####}, a value between 0 and 1, the closer to 1, the better");
            Console.WriteLine($"    MicroAccuracy = {metrics.MicroAccuracy:0.####}, a value between 0 and 1, the closer to 1, the better");
            Console.WriteLine($"    LogLoss = {metrics.LogLoss:0.####}, the closer to 0, the better");
            for (int i = 0; i < metrics.PerClassLogLoss.Count; i++)
            {
                Console.WriteLine($"    LogLoss for class {i + 1} = {metrics.PerClassLogLoss[i]:0.####}, the closer to 0, the better");
            }
            Console.WriteLine($"************************************************************");
        }

        public static void PrintMulticlassClassificationFoldsAverageMetrics(IEnumerable<TrainCatalogBase.CrossValidationResult<MulticlassClassificationMetrics>> crossValResults)
        {
            var metricsInMultipleFolds = crossValResults.Select(r => r.Metrics);

            var microAccuracyValues = metricsInMultipleFolds.Select(m => m.MicroAccuracy);
            var microAccuracyAverage = microAccuracyValues.Average();
            var microAccuraciesStdDeviation = CalculateStandardDeviation(microAccuracyValues);
            var microAccuraciesConfidenceInterval95 = CalculateConfidenceInterval95(microAccuracyValues);

            var macroAccuracyValues = metricsInMultipleFolds.Select(m => m.MacroAccuracy);
            var macroAccuracyAverage = macroAccuracyValues.Average();
            var macroAccuraciesStdDeviation = CalculateStandardDeviation(macroAccuracyValues);
            var macroAccuraciesConfidenceInterval95 = CalculateConfidenceInterval95(macroAccuracyValues);

            var logLossValues = metricsInMultipleFolds.Select(m => m.LogLoss);
            var logLossAverage = logLossValues.Average();
            var logLossStdDeviation = CalculateStandardDeviation(logLossValues);
            var logLossConfidenceInterval95 = CalculateConfidenceInterval95(logLossValues);

            var logLossReductionValues = metricsInMultipleFolds.Select(m => m.LogLossReduction);
            var logLossReductionAverage = logLossReductionValues.Average();
            var logLossReductionStdDeviation = CalculateStandardDeviation(logLossReductionValues);
            var logLossReductionConfidenceInterval95 = CalculateConfidenceInterval95(logLossReductionValues);

            Console.WriteLine($"*************************************************************************************************************");
            Console.WriteLine($"*       Metrics for Multi-class Classification model      ");
            Console.WriteLine($"*------------------------------------------------------------------------------------------------------------");
            Console.WriteLine($"*       Average MicroAccuracy:    {microAccuracyAverage:0.###}  - Standard deviation: ({microAccuraciesStdDeviation:#.###})  - Confidence Interval 95%: ({microAccuraciesConfidenceInterval95:#.###})");
            Console.WriteLine($"*       Average MacroAccuracy:    {macroAccuracyAverage:0.###}  - Standard deviation: ({macroAccuraciesStdDeviation:#.###})  - Confidence Interval 95%: ({macroAccuraciesConfidenceInterval95:#.###})");
            Console.WriteLine($"*       Average LogLoss:          {logLossAverage:#.###}  - Standard deviation: ({logLossStdDeviation:#.###})  - Confidence Interval 95%: ({logLossConfidenceInterval95:#.###})");
            Console.WriteLine($"*       Average LogLossReduction: {logLossReductionAverage:#.###}  - Standard deviation: ({logLossReductionStdDeviation:#.###})  - Confidence Interval 95%: ({logLossReductionConfidenceInterval95:#.###})");
            Console.WriteLine($"*************************************************************************************************************");

        }

        public static double CalculateStandardDeviation(IEnumerable<double> values)
        {
            double average = values.Average();
            double sumOfSquaresOfDifferences = values.Select(val => (val - average) * (val - average)).Sum();
            double standardDeviation = Math.Sqrt(sumOfSquaresOfDifferences / (values.Count() - 1));
            return standardDeviation;
        }

        public static double CalculateConfidenceInterval95(IEnumerable<double> values)
        {
            double confidenceInterval95 = 1.96 * CalculateStandardDeviation(values) / Math.Sqrt((values.Count() - 1));
            return confidenceInterval95;
        }

        private static IEnumerable<ModelInput> GetModelInputs()
        {
            var lines = File.ReadAllLines(TRAIN_DATA_FILEPATH);
            for (var i = 1; i < lines.Length; i++)
            {
                var columns = lines[i].Split(','); //naive csv parsing
                yield return new ModelInput()
                {
                    Label = columns[0],
                    ImageSource = (Bitmap)Bitmap.FromFile(columns[1]), //ouch -- lots of in-memory bitmaps
                };
            }
        }

    }
}
LittleLittleCloud commented 3 years ago

Hi @sarah-graf

The schema mismatch error comes from LoadRawImageBytes, whose input column has to be an image path and whose output is image bytes. However, you don't need the LoadRawImageBytes transform in BuildTrainingPipeline, because your images have already been loaded into memory by GetModelInputs.

So your dataProcessPipeline should look like this

var dataProcessPipeline = mlContext.Transforms.Conversion.MapValueToKey("Label", "Label")
                                      .Append(mlContext.Transforms.CopyColumns("Features", "ImageSource"));

Also, CopyColumns is not necessary if you set the feature column in ImageClassification to ImageSource (or rename ImageSource to Features in your ModelInput class). Meanwhile, you might also want to resize your images to 224x224, which I believe is the size used to pretrain the model in ImageClassification:

var dataProcessPipeline = mlContext.Transforms.Conversion.MapValueToKey("Label", "Label")
                                      .Append(mlContext.Transforms.ResizeImages("ImageSource", "ImageSource", 224, 224))
                                      .Append(mlContext.Transforms.CopyColumns("Features", "ImageSource"));
sarah-graf commented 3 years ago

Hi @LittleLittleCloud

many thanks for your response! I have adjusted the pipeline, with one small difference, because ResizeImages expects a different order of parameters (see picture): I changed .Append(mlContext.Transforms.ResizeImages("ImageSource", "ImageSource", 224, 224)) to .Append(mlContext.Transforms.ResizeImages("ImageSource", 224, 224)).

(screenshot of the ResizeImages method overloads)

After the changes I no longer get the old error, but in the same place I get a new one: Schema mismatch for feature column 'Features': expected VarVector<Byte>, got Image<224, 224>.

I also set up another project where I went through the same steps again to make sure the problem wasn't something that I had changed previously and that I forgot to change back again. However, I am currently getting the same error in the same place in the new project.

LittleLittleCloud commented 3 years ago

Oops, I forgot to add ExtractPixels. It will convert images to byte vectors:

var dataProcessPipeline = mlContext.Transforms.Conversion.MapValueToKey("Label", "Label")
                                      .Append(mlContext.Transforms.ResizeImages("ImageSource", 224, 224))
                                      .Append(mlContext.Transforms.ExtractPixels("ImageSource"))
                                      .Append(mlContext.Transforms.CopyColumns("Features", "ImageSource"));
sarah-graf commented 3 years ago

Hi @LittleLittleCloud, thank you again for the input! With ExtractPixels I get a Vector where a VarVector is expected. Should I pass something additional to ExtractPixels besides the column name? I'm sorry to bother you again; unfortunately there is little to be found online about this topic.
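One possible way around this particular mismatch (a hedged sketch, not something confirmed in this thread): since the ImageClassification trainer expects raw encoded image bytes (VarVector<Byte>) rather than decoded pixels, a CustomMapping could re-encode each in-memory Bitmap to bytes inside the pipeline. The BitmapInput/BytesOutput types and the JPEG format choice below are illustrative assumptions:

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using Microsoft.ML.Transforms.Image;

// Hypothetical mapping types -- not part of the generated code.
class BitmapInput
{
    [ImageType(224, 224)]
    public Bitmap ImageSource { get; set; }
}

class BytesOutput
{
    // A byte[] member surfaces as VarVector<Byte> in the IDataView schema.
    public byte[] Features { get; set; }
}

// Re-encode each Bitmap to JPEG bytes so the trainer sees VarVector<Byte>.
var encodeToBytes = mlContext.Transforms.CustomMapping<BitmapInput, BytesOutput>(
    (input, output) =>
    {
        using (var ms = new MemoryStream())
        {
            input.ImageSource.Save(ms, ImageFormat.Jpeg);
            output.Features = ms.ToArray();
        }
    },
    contractName: null);
```

Note that a pipeline containing an anonymous CustomMapping (contractName: null) can be fitted and used in-process, but cannot be saved to disk unless a contract name and a CustomMappingFactory are registered.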

vzhuqin commented 3 years ago

@LittleLittleCloud Could you provide me with information on how to validate this issue?

LittleLittleCloud commented 3 years ago

Yeah sure, (#851 will have the same validation step)

(A kind reminder that the web API is currently broken; I have a fix PR for that, but it's not merged yet.)

azure image classification && object detection

Create an Azure image classification or object detection experiment, finish training, and validate the following places:

evaluate page

everything should work

consume page

everything should work (console, notebook, web api), plus

snippet code

The snippet code should create a ModelInput class that takes a bitmap image as input (see the following example):

//Load sample data
var image = (Bitmap)Image.FromFile(@"C:\Users\xiaoyuz\Desktop\WeatherData2\Cloudy\cloudy1.jpg");
AzureImage.ModelInput sampleData = new AzureImage.ModelInput()
{
    ImageSource = image,
};

//Load model and predict output
var result = AzureImage.Predict(sampleData);

local image classification (cpu/gpu)

evaluation page

everything should work

consume page

everything should work, plus

snippet code

The snippet code should create a ModelInput class that reads the image as raw bytes (see the following example):

//Load sample data
var imageBytes = File.ReadAllBytes(@"C:\Users\xiaoyuz\Desktop\WeatherData2 - Copy\Cloudy\cloudy1.jpg");
Image.ModelInput sampleData = new Image.ModelInput()
{
    ImageSource = imageBytes,
};

//Load model and predict output
var result = Image.Predict(sampleData);
vzhuqin commented 3 years ago

Verified this issue on latest main: 16.9.1.2156701. Training finished without error for the Image Classification (Local (CPU) & Azure) and Object Detection scenarios, and everything works fine; details below.

Image Local: verified the Evaluate and Consume pages.

Image Azure: verified the Evaluate and Consume pages.

Object Detection: verified the Evaluate and Consume pages.

Code snippet: copied as shown in the attached screenshots.