dotnet / machinelearning-modelbuilder

Simple UI tool to build custom machine learning models.
Creative Commons Attribution 4.0 International

The output of running WebAPI is confusing. #1892

Open vzhuqin opened 2 years ago

vzhuqin commented 2 years ago

To Reproduce

Steps to reproduce the behavior:

  1. Select Create a new project from the Visual Studio 2019 start window;
  2. Choose the C# Console App (.NET Core) project template with .NET 5.0;
  3. Add Model Builder by right-clicking the project;
  4. Select Data classification and complete training;
  5. Add the WebAPI project to the solution from the Consume page;
  6. Run the WebAPI project and call it from PowerShell;
  7. Observe that the output is very confusing.

Screenshots: image

LittleLittleCloud commented 2 years ago

This is because we now use the schema to generate the model output class, so it includes all columns rather than just the score. The confusing values are the encoded data/vectors from the model output.

I personally prefer this kind of output because it accurately shows what you get from the model. @JakeRadMSFT @briacht thoughts?
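
For reference, a rough sketch of the difference described above. This is not the code Model Builder actually generates: the class names and the `col0` input column are hypothetical, and the property types are assumptions about a typical classification pipeline. Only `ColumnNameAttribute` is taken from ML.NET itself.

```csharp
using Microsoft.ML.Data;

// Hypothetical "score only" shape: just the prediction columns.
public class ModelOutputScoreOnly
{
    [ColumnName("PredictedLabel")]
    public string PredictedLabel { get; set; }

    [ColumnName("Score")]
    public float[] Score { get; set; }
}

// Hypothetical schema-driven shape: every column in the output schema,
// including intermediate encoded values that are hard to read in a response.
public class ModelOutputAllColumns
{
    // Assumed input column, passed through unchanged.
    [ColumnName("col0")]
    public string Col0 { get; set; }

    // Assumed key-encoded label produced by the pipeline.
    [ColumnName("Label")]
    public uint Label { get; set; }

    // Assumed featurized vector; this is the kind of value that looks "confusing".
    [ColumnName("Features")]
    public float[] Features { get; set; }

    [ColumnName("PredictedLabel")]
    public string PredictedLabel { get; set; }

    [ColumnName("Score")]
    public float[] Score { get; set; }
}
```

Serializing the second shape as a Web API response would emit the feature vector and encoded label alongside the prediction, which matches the kind of output reported in this issue.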

beccamc commented 2 years ago

Jake will take a look at this strange output. What is in the project we generate? Doesn't seem right.

JakeRadMSFT commented 2 years ago

> This is because we now use the schema to generate the model output class, so it includes all columns rather than just the score. The confusing values are the encoded data/vectors from the model output.
>
> I personally prefer this kind of output because it accurately shows what you get from the model. @JakeRadMSFT @briacht thoughts?

I oddly don't remember that change ... I would have expected just the output.

@michaelgsharp has a way to get just the output columns ... I think ... from schema.

michaelgsharp commented 2 years ago

This will figure out the Score, Probability, and PredictedLabel columns.

// Needed for ReadOnlyMemory<char>, Reverse() and DataViewSchema.
using System;
using System.Linq;
using Microsoft.ML;

// Assumes `model` is the trained transformer and `data` is the input data view.
var outputSchema = model.GetOutputSchema(data.Schema);
var columnType = default(ReadOnlyMemory<char>);
DataViewSchema.Column scoreColumn = default;
DataViewSchema.Column predictedLabelColumn = default;
DataViewSchema.Column probabilityColumn = default;

foreach (var column in outputSchema.Reverse())
{
    try
    {
        // This throws if the "ScoreValueKind" annotation isn't found on the column.
        column.Annotations.GetValue("ScoreValueKind", ref columnType);
        // We only reach this code if no exception was thrown.
        if (columnType.ToString() == "Score")
            scoreColumn = column;
        if (columnType.ToString() == "PredictedLabel")
            predictedLabelColumn = column;
        if (columnType.ToString() == "Probability")
            probabilityColumn = column;
    }
    catch (InvalidOperationException)
    {
        // Nothing to do here. The catch only lets the loop continue with the next column.
    }
}

michaelgsharp commented 2 years ago

If it doesn't find them, it means that some of our trainers aren't annotating the columns correctly, and that's a bug we would need to fix.
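
A minimal sketch of how that check could be made explicit, assuming the same "ScoreValueKind" annotation as in the snippet above. The `ScoreColumnFinder` class and its method names are hypothetical, not part of ML.NET or Model Builder. It checks whether the annotation exists before reading it, so no exception handling is needed, and it reports which expected kinds were never found.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical helper, for illustration only.
public static class ScoreColumnFinder
{
    private const string ScoreValueKind = "ScoreValueKind";

    // Maps each annotation value (e.g. "Score") to the first column in schema
    // order that carries it, which is also where the reversed loop above ends up.
    public static Dictionary<string, DataViewSchema.Column> Find(DataViewSchema outputSchema)
    {
        var found = new Dictionary<string, DataViewSchema.Column>();
        foreach (var column in outputSchema)
        {
            // Skip columns without a text-typed ScoreValueKind annotation
            // instead of relying on GetValue throwing.
            var annotation = column.Annotations.Schema.GetColumnOrNull(ScoreValueKind);
            if (annotation == null || !(annotation.Value.Type is TextDataViewType))
                continue;

            var kind = default(ReadOnlyMemory<char>);
            column.Annotations.GetValue(ScoreValueKind, ref kind);

            var key = kind.ToString();
            if (!found.ContainsKey(key))
                found[key] = column;
        }
        return found;
    }

    // Returns the expected kinds that no column was annotated with.
    public static List<string> Missing(DataViewSchema outputSchema, params string[] expectedKinds)
    {
        var found = Find(outputSchema);
        return expectedKinds.Where(kind => !found.ContainsKey(kind)).ToList();
    }
}
```

Calling `ScoreColumnFinder.Missing(outputSchema, "Score", "PredictedLabel", "Probability")` would then list the kinds a trainer failed to annotate. Not every task produces all three columns, so an entry in that list is a hint to investigate rather than proof of a bug.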