dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

How to predict integer values (seven segments display output) using ML.NET? #226

Closed Rowandish closed 5 years ago

Rowandish commented 6 years ago

Question

I'm looking at the .cs file here: https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet/get-started/windows and everything works well.

Now I'd like to extend the example: I'd like to predict from a number-only dataset rather than a number-string dataset, for example to predict the output of a seven-segment display.

Here is my super easy dataset, the last column is the int number that I want to predict:

1,0,1,1,1,1,1,0
0,0,0,0,0,1,1,1
1,1,1,0,1,1,0,2
1,1,1,0,0,1,1,3
0,1,0,1,0,1,1,4
1,1,1,1,0,0,1,5
1,1,1,1,1,0,1,6
1,0,0,0,0,1,1,7
1,1,1,1,1,1,1,8
1,1,1,1,0,1,1,9

And here is my test code:

public class Digit
{
    [Column("0")] public float Up;

    [Column("1")] public float Middle;

    [Column("2")] public float Bottom;

    [Column("3")] public float UpLeft;
    [Column("4")] public float BottomLeft;
    [Column("5")] public float TopRight;
    [Column("6")] public float BottomRight;

    [Column("7")] [ColumnName("Label")] public float Label;
}

public class DigitPrediction
{
    [ColumnName("Score")] public float[] Score;
}

public PredictDigit()
{
    var pipeline = new LearningPipeline();
    var dataPath = Path.Combine("Segmenti", "segments.txt");
    pipeline.Add(new TextLoader<Digit>(dataPath, false, ","));
    pipeline.Add(new ColumnConcatenator("Label", "DigitValue"));
    pipeline.Add(new ColumnConcatenator("Features", "Up", "Middle", "Bottom", "UpLeft", "BottomLeft", "TopRight", "BottomRight"));
    pipeline.Add(new StochasticDualCoordinateAscentClassifier());
    var model = pipeline.Train<Digit, DigitPrediction>();
    var prediction = model.Predict(new Digit
    {
        Up = 1,
        Middle = 1,
        Bottom = 1,
        UpLeft = 1,
        BottomLeft = 1,
        TopRight = 1,
        BottomRight = 1,
    });

    Console.WriteLine($"Predicted digit is: {prediction.Score}");
    Console.ReadLine();
}

Now the system "works": prediction.Score comes back as a Single[] in which the index of the highest value is the predicted digit. How can I obtain the predicted value directly? Is this the right approach?

Ivanidzo4ka commented 6 years ago

public class DigitPrediction
{
    [ColumnName("PredictedLabel")]
    public uint ExpectedDigit;

    [ColumnName("Score")]
    public float[] Score;
}

This will almost do the trick. You will get the expected digit in ExpectedDigit, except that it will be offset by one (I'll file a separate issue about that). The scores just give you the "probability" of each class: the higher the score, the higher the chance that the input belongs to that class.
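
The two ways of reading the result discussed so far can be sketched in plain C#, with no ML.NET dependency. This is only an illustration: PredictionHelpers, ArgMax, and DigitFromPredictedLabel are hypothetical names, the scores are made up, and the code assumes that the offset mentioned above means PredictedLabel is one higher than the digit (the thread does not state the direction).

```csharp
using System;

public static class PredictionHelpers
{
    // Returns the index of the largest score; the first one wins on ties.
    public static int ArgMax(float[] scores)
    {
        if (scores == null || scores.Length == 0)
            throw new ArgumentException("scores must contain at least one value", nameof(scores));

        int best = 0;
        for (int i = 1; i < scores.Length; i++)
        {
            if (scores[i] > scores[best])
                best = i;
        }
        return best;
    }

    // ASSUMPTION: PredictedLabel is the digit plus one, so subtracting 1 recovers it.
    public static int DigitFromPredictedLabel(uint predictedLabel)
    {
        if (predictedLabel == 0)
            throw new ArgumentOutOfRangeException(nameof(predictedLabel), "expected a value of at least 1");
        return (int)predictedLabel - 1;
    }

    public static void Main()
    {
        // Made-up scores for illustration only; not real model output.
        float[] scores = { 0.01f, 0.02f, 0.01f, 0.03f, 0.01f, 0.02f, 0.05f, 0.01f, 0.80f, 0.04f };

        Console.WriteLine($"Digit from Score: {ArgMax(scores)}");
        Console.WriteLine($"Digit from PredictedLabel: {DigitFromPredictedLabel(9)}");
    }
}
```

With a trained model, ArgMax would take prediction.Score and DigitFromPredictedLabel would take prediction.ExpectedDigit. If the offset turns out to run the other way, only the subtraction needs to change.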

glebuk commented 6 years ago

One more suggestion. If you expect to have data like this with many more numerical columns, your code and performance will improve significantly if you declare a vector feature column, such as:

public class Digit
{
    public float Up { get =>  Features[0];  set => Features[0] = value; }
    ...

    [Column("0-6")] [VectorType(7)] public float[] Features;
    [Column("7")] [ColumnName("Label")] public float Label;
}
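
A minimal, stand-alone sketch of the array-backed property pattern above might look as follows. The ML.NET attributes are left out so that it compiles on its own, DigitSketch is a hypothetical name, and the index assigned to each segment is an assumption taken from the column order in the question's Digit class. The array is allocated up front so that the setters never write into a null reference.

```csharp
using System;

// Plain C# illustration; in the real class the fields would carry the ML.NET attributes.
public class DigitSketch
{
    // Seven segment values stored in a single vector.
    public float[] Features = new float[7];
    public float Label;

    // ASSUMPTION: indexes follow the column order used in the question.
    public float Up          { get => Features[0]; set => Features[0] = value; }
    public float Middle      { get => Features[1]; set => Features[1] = value; }
    public float Bottom      { get => Features[2]; set => Features[2] = value; }
    public float UpLeft      { get => Features[3]; set => Features[3] = value; }
    public float BottomLeft  { get => Features[4]; set => Features[4] = value; }
    public float TopRight    { get => Features[5]; set => Features[5] = value; }
    public float BottomRight { get => Features[6]; set => Features[6] = value; }
}

public static class Program
{
    public static void Main()
    {
        // All seven segments lit, as in the question's test input.
        var eight = new DigitSketch
        {
            Up = 1, Middle = 1, Bottom = 1, UpLeft = 1,
            BottomLeft = 1, TopRight = 1, BottomRight = 1,
        };

        // The named properties and the vector are two views of the same data.
        Console.WriteLine(string.Join(",", eight.Features));
    }
}
```

The named properties keep call sites readable, while the pipeline only has to deal with one vector column instead of seven scalar ones.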

dotnetKyle commented 6 years ago

This question was originally asked on StackOverflow: https://stackoverflow.com/questions/50497593/how-to-predict-integer-values-using-ml-net

Rowandish commented 6 years ago

Thanks for all the suggestions; @Ivanidzo4ka's suggestion does the trick. I also implemented @glebuk's improvements and they work. Now I'll answer my own question on Stack Overflow by posting the complete example.

Ivanidzo4ka commented 5 years ago

DRI RESPONSE: This thread looks answered, and I plan to close the issue within the next few days.