dotnet / machinelearning

Probability is missing from the prediction Output schema of BinaryClassification.Trainers.AveragedPerceptron #3299

Closed by bsoman3 5 years ago

bsoman3 commented 5 years ago

Issue

Using the Averaged Perceptron Binary Classifier in the Pipeline:

var pipeline =
//Other things in the pipeline
.Append(mlContext.BinaryClassification.Trainers.AveragedPerceptron(learningRate: lr, numIterations: 5));

//Fit Model steps
//Save Model steps
//Load Model steps

var predictions = loadedModel.Transform(data);
var metrics = mlContextTest.BinaryClassification.Evaluate(predictions);

Leads to the following error.

'Probability column 'Probability' not found
Parameter name: schema'

What Happened

The output schema of the predictions IDataView has no Probability column.

Expected Behavior

The Probability column should be available in the predictions IDataView, according to the reply by @zeahmed here: https://github.com/dotnet/machinelearning/issues/376#issuecomment-399282699

Other binary classifiers, such as FastTree, do have that column in their output schema.

System information

wschin commented 5 years ago

AveragedPerceptron doesn't produce a probability output because of its mathematical properties: it is a linear classifier whose raw score is a signed margin, not a calibrated probability. To get a probability, you need to add a calibrator. Please see https://github.com/dotnet/machinelearning/blob/29ca1f8dc4c9c15076d7b858490b52d08ae979c8/docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/BinaryClassification/Calibrators/Platt.cs#L42, which first trains AveragedPerceptron and then adds a calibrator.
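
For readers who cannot open the linked sample, here is a minimal sketch of the train-then-calibrate approach. It assumes the ML.NET 1.x API names (for example `numberOfIterations`, where the snippet in the issue uses the older `numIterations`) and C# 9 top-level statements. The `Row` class and the toy data are hypothetical and exist only for illustration. For brevity the calibrator is fitted on the same data the model was trained on; a held-out set would normally be a better choice.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 0);

// Hypothetical toy data: the label is true when the two features sum to more than 1.
var rng = new Random(0);
var rows = Enumerable.Range(0, 200).Select(_ =>
{
    var x = new[] { (float)rng.NextDouble(), (float)rng.NextDouble() };
    return new Row { Label = x[0] + x[1] > 1f, Features = x };
}).ToList();
var data = mlContext.Data.LoadFromEnumerable(rows);

// Step 1: train AveragedPerceptron. Its output has Score and PredictedLabel,
// but no Probability column.
var model = mlContext.BinaryClassification.Trainers
    .AveragedPerceptron(numberOfIterations: 5)
    .Fit(data);
var scored = model.Transform(data);

// Step 2: fit a Platt calibrator on the scored data. It maps the raw Score
// through a fitted sigmoid and adds a Probability column.
var calibrator = mlContext.BinaryClassification.Calibrators.Platt().Fit(scored);
var calibrated = calibrator.Transform(scored);

// The calibrated evaluator can now find the Probability column.
var metrics = mlContext.BinaryClassification.Evaluate(calibrated);
Console.WriteLine($"AUC: {metrics.AreaUnderRocCurve:F3}, LogLoss: {metrics.LogLoss:F3}");

// Hypothetical row type for the toy data set.
public class Row
{
    public bool Label { get; set; }

    [VectorType(2)]
    public float[] Features { get; set; }
}
```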

rogancarr commented 5 years ago

You can also use your code as is, but evaluate with the EvaluateNonCalibrated evaluator.

mlContext.BinaryClassification.EvaluateNonCalibrated()

This is the preferred route if you need neither a probability nor probabilistic evaluation metrics (such as log-loss).
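
A minimal sketch of that route, assuming the ML.NET 1.x API names and C# 9 top-level statements; the `Point` class and the six toy rows are hypothetical and only make the example self-contained. In the code from the issue, the equivalent change would be to call `EvaluateNonCalibrated(predictions)` in place of `Evaluate(predictions)`.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 0);

// Hypothetical toy data, separable on the first feature.
var data = mlContext.Data.LoadFromEnumerable(new[]
{
    new Point { Label = false, Features = new[] { 0.1f, 0.9f } },
    new Point { Label = false, Features = new[] { 0.2f, 0.4f } },
    new Point { Label = false, Features = new[] { 0.3f, 0.7f } },
    new Point { Label = true,  Features = new[] { 0.7f, 0.2f } },
    new Point { Label = true,  Features = new[] { 0.8f, 0.6f } },
    new Point { Label = true,  Features = new[] { 0.9f, 0.1f } },
});

var model = mlContext.BinaryClassification.Trainers
    .AveragedPerceptron(numberOfIterations: 5)
    .Fit(data);
var predictions = model.Transform(data);

// EvaluateNonCalibrated needs only Label, Score and PredictedLabel, so the
// missing Probability column is not an error here. The returned metrics
// omit probabilistic measures such as log-loss.
var metrics = mlContext.BinaryClassification.EvaluateNonCalibrated(predictions);
Console.WriteLine($"Accuracy: {metrics.Accuracy:F3}, AUC: {metrics.AreaUnderRocCurve:F3}");

// Hypothetical row type for the toy data set.
public class Point
{
    public bool Label { get; set; }

    [VectorType(2)]
    public float[] Features { get; set; }
}
```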

bsoman3 commented 5 years ago

Makes sense! Thanks for the input! This can be closed, as far as I know.