Closed robinmohseni closed 5 years ago
Windows 10, Microsoft.ML (1.2.0), Microsoft.ML.LightGbm (1.2.0), .NET Core 2.2
After training a LightGBM model, the model produces multiclass scores in [0, 1] that sum to 1, as expected.
However, after saving the model and then loading it into a new trainedModel object, the scores are no longer probabilities but raw, unnormalized values (some negative, and not summing to 1).
I have tested saving and loading with other model types and cannot reproduce the problem; it occurs only with the LightGBM model.
Please advise. I am now rolling back library versions to see whether the issue persists.
Before saving model...
0.003305528
0.01293249
0.01907223
0.9646355
5.421485E-05
3.556848E-08

After saving model...
-3.623514
-2.259367
-1.870877
2.05264
-7.733911
-15.06316
Source code
mlContext.Model.Save(trainedModel, dataView.Schema, _modelPath);

// Save Data Prep transformer
//mlContext.Model.Save(pipeline, dataView.Schema, "data_preparation_pipeline.zip");
schema = dataView.Schema;

Console.WriteLine("Before saving model...");
TestModelOutput(mlContext, trainedModel);

// Load trained model
trainedModel = mlContext.Model.Load(_modelPath, out schema);
//trainedModel = mlContext.Model.LoadWithDataLoader()

Console.WriteLine("After saving model...");
TestModelOutput(mlContext, trainedModel);

private static void TestModelOutput(MLContext mlContext, ITransformer model)
{
    IDataView batchData = mlContext.Data.LoadFromEnumerable(testActions);
    IDataView predictions = model.Transform(batchData);
    IEnumerable<PredictionData> predictedResults = mlContext.Data
        .CreateEnumerable<PredictionData>(predictions, reuseRowObject: false);

    foreach (var item in predictedResults)
    {
        foreach (var score in item.Score)
        {
            Console.WriteLine(score);
        }
    }
}
https://github.com/dotnet/machinelearning/issues/3647
If I apply a softmax transformation, exp(x) / sum(exp(x)), to the scores from the loaded model, I can reproduce the expected probabilities. Obviously this is not an ideal workaround.
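For anyone who needs the softmax workaround in the meantime, here is a minimal sketch of it in plain C#. The class and method names (SoftmaxSketch, Softmax) are hypothetical helpers, not part of Microsoft.ML, and the sketch assumes the loaded model's Score column holds the raw per-class values shown in the "After saving model..." output.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of Microsoft.ML.
public static class SoftmaxSketch
{
    // Converts raw multiclass scores into probabilities that sum to 1,
    // using exp(x) / sum(exp(x)).
    public static float[] Softmax(float[] rawScores)
    {
        // Subtracting the maximum before exponentiating avoids overflow
        // and does not change the result.
        float max = rawScores.Max();
        double[] exps = rawScores.Select(s => Math.Exp(s - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => (float)(e / sum)).ToArray();
    }
}
```

Calling SoftmaxSketch.Softmax(item.Score) inside the inner loop of TestModelOutput, and printing the result instead of the raw scores, should give values close to the "Before saving model..." output; for the scores logged above, the fourth entry comes out at roughly 0.9646.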