Closed voges316 closed 3 years ago
Hey @voges316 I am looking into this. I have not had a chance to directly test this myself, but I noticed that the 2 commands you run are different. On the old version you have --label-column-index 1 and the new version --label-col 1. Can you see what happens if you use the old command in the latest version and let me know?
@michaelgsharp yes, the cli options changed from v0.15 to v16.x. Using the older --label-column-index using v16.2 prints out an error and the help menu.
Option '--label-col' is required.
Unrecognized command or argument '--label-column-index'
Unrecognized command or argument '1'
@voges316 this is interesting. With version 16.2.0 I am also getting the same warning about the header, and the label is a string for me as well, but it runs without errors when I run it. I have attached the solution generated when I run it. Can you test it on your machine and see what happens? If it works, can you compare the differences between my project and yours? In theory they should be the same since it's with the same version, but something is obviously different. YelpML16.zip
@michaelgsharp So I tried to import the model in the existing code, and it failed to even import.
# Import line
ITransformer predictionPipeline = mlContext.Model.Load("models/YelpML16.zip", out predictionPipelineSchema);
# Exception
Unhandled exception. System.InvalidOperationException: Could not load legacy format model
---> System.InvalidOperationException: Repository doesn't contain entry DataLoaderModel/Model.key
Here's the full exception if it helps
I also created a repo just to show what I was doing: https://github.com/voges316/MlnetModelImports
And you tried to import that using version 16.2.0 correct?
Yes. mlnet v16.2.0 is what was used to generate the model. I am trying to import the model in dotnet-core, using Microsoft.ML v1.5.4
So using your exact same project (after modifying the input/output to match the 16.2.0 version) it loaded and ran the model fine. I am wondering if there are multiple versions of ML.NET on your machine that are conflicting with each other somehow. Can you clear out your nuget package folder and then rebuild and test again?
So I deleted .nuget/packages/*, rebooted too for good measure, ran dotnet restore and dotnet run and encountered the same error.
Unhandled exception. System.InvalidOperationException: Can't bind the IDataView column 'PredictedLabel' of type 'String' to field or property 'Prediction' of type 'System.Boolean'.
I actually think that's progress. I had to change the ModelOutput to this:
public class ModelOutput
{
//[ColumnName("PredictedLabel")]
//public bool Prediction { get; set; }
//public float Probability { get; set; }
//public float Score { get; set; }
// This is what 16.2.0 gives me
[ColumnName("PredictedLabel")]
public String Prediction { get; set; }
// Note, the first value in the Score array corresponds to the prediction
public float[] Score { get; set; }
}
That should get you past the current error you are seeing.
You should then see an error on this line:
Console.WriteLine($"Text: {input.Col0} | Prediction: {(Convert.ToBoolean(result.Prediction) ? "Positive" : "Negative")} review | Probability of being positive: {result.Score[0]} ");
result.Prediction is coming back as "0", and the call to Convert.ToBoolean isn't happy with that. I changed it to this:
Console.WriteLine($"Text: {input.Col0} | Prediction: {(result.Prediction == "0" ? "Negative" : "Positive" )} review | Probability: {result.Score[0]} ");
Try that out and let me know what happens.
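For context on why the original line fails: Convert.ToBoolean(string) only accepts the text "True" or "False" (case-insensitive), so passing "0" throws a FormatException. Below is a minimal sketch of a stricter mapping than the one-line fix above. SentimentLabel and Describe are hypothetical names used only for illustration, and the "0"/"1" label values are assumed from the Yelp dataset discussed in this thread.

```csharp
using System;

// Hypothetical helper (not part of ML.NET): maps the string label that the
// v16.2-generated model returns to a readable sentiment name.
public static class SentimentLabel
{
    public static string Describe(string predictedLabel)
    {
        switch (predictedLabel?.Trim())
        {
            case "1": return "Positive";
            case "0": return "Negative";
            default:
                // Fail loudly on anything else instead of silently treating it as positive.
                throw new ArgumentException(
                    $"Unexpected label '{predictedLabel}'", nameof(predictedLabel));
        }
    }
}

public static class Program
{
    public static void Main()
    {
        try
        {
            // Boolean parsing only understands "True"/"False", so "0" is rejected.
            Convert.ToBoolean("0");
        }
        catch (FormatException)
        {
            Console.WriteLine("Convert.ToBoolean(\"0\") throws FormatException");
        }

        Console.WriteLine(SentimentLabel.Describe("0"));
        Console.WriteLine(SentimentLabel.Describe("1"));
    }
}
```

Unlike the comparison against "0" above, this sketch rejects any label other than "0" or "1", which may help surface a stray header value if one ever ends up in the label column.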
Well that does indeed appear to fix it. So the actual modelinput & modeloutput generated by the mlnet cli tool differed between v0.15.1 and v16.2. That does make sense, but why is it saying columns that are 0/1 are 'strings' instead of booleans? But maybe that's just a code generation issue. Thanks for the help.
# dotnet run
Hello World!
=============== Single Prediction ===============
Text: Meh, food was cold. | Prediction: Negative review | Probability of being positive: 0.00078324333
================End of Process.Hit any key to exit==================================
It says they are strings because it parsed the input label column as a string, so the output label column is as well. I am still looking into why it is doing that, and why it thinks there is a header. It's possible that when those issues are resolved it will go back to being a boolean value.
Since your immediate issue has been resolved, I am going to go ahead and close this issue for now. I will shortly create issues to track the header and boolean issue. If you have more issues please feel free to either re-open this ticket (if it's the same issue) or create a new one as needed.
System information
Issue
After upgrading the mlnet cli tool from v0.15.1 to v16.2, I am unable to use the C# api to import a model that the most recent mlnet cli tool generated on a yelp sentiment dataset.
What did you do? Tried to import a model generated from mlnet cli v16.2 and predict an input.
What happened?
I get the following error
Source code / logs
Previous mlnet cli tool: mlnet --version => 0.15.28007.4
Yelp labelled dataset from here: http://archive.ics.uci.edu/ml/machine-learning-databases/00331/sentiment%20labelled%20sentences.zip
Previous command to train a model on a dataset: mlnet auto-train --task binary-classification --dataset data/yelp_labelled.txt --label-column-index 1 --has-header false --max-exploration-time 30 --name YelpDemo
Now I can load that model into a console app and run it fine. Showing the ModelInput.cs generated by the mlnet cli tool. I can just create another input class like this and use it to load the model, create a prediction engine, and score a new input.
However, if I upgrade the mlnet cli tool it doesn't work.
mlnet --version => 16.2.0
Command to train a model: mlnet classification --dataset data/yelp_labelled.txt --has-header false --label-col 1 --train-time 30 --name YelpML16
The tool appears to print a warning about a header being detected in the dataset, even though no header is used in this dataset.
When I try and import the model and create a prediction engine I encounter the error I pasted above, saying: Unhandled exception. System.ArgumentOutOfRangeException: Could not find input column 'col1' (Parameter 'inputSchema')
Diving into the mlnet v16.2 classes, it appears the ModelInput.cs has changed. The second column is no longer a boolean Label, but a string Col1
This is different than the ModelInput.cs generated by v0.15.1, but if I change the modelinput class to match the mlnet cli output I still get an exception loading the model and creating the prediction engine.
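One way to check which columns a saved model expects, rather than relying on the generated ModelInput.cs, is to print the schema stored with the model. The following is a rough sketch, assuming the Microsoft.ML NuGet package is referenced; SchemaDump is a hypothetical class name, and the default path is the one used earlier in this thread.

```csharp
using System;
using Microsoft.ML;

// Hypothetical diagnostic program: prints the input and output columns of a saved model.
public static class SchemaDump
{
    public static void Main(string[] args)
    {
        // Default path taken from this thread; pass a different one as the first argument.
        string modelPath = args.Length > 0 ? args[0] : "models/YelpML16.zip";

        var mlContext = new MLContext();
        ITransformer model = mlContext.Model.Load(modelPath, out DataViewSchema inputSchema);

        if (inputSchema == null)
        {
            // A model saved without an input description has no schema to print.
            Console.WriteLine("The model was saved without an input schema.");
            return;
        }

        Console.WriteLine("Input columns the model expects:");
        foreach (DataViewSchema.Column column in inputSchema)
            Console.WriteLine($"  {column.Name}: {column.Type}");

        Console.WriteLine("Output columns the model produces:");
        foreach (DataViewSchema.Column column in model.GetOutputSchema(inputSchema))
            Console.WriteLine($"  {column.Name}: {column.Type}");
    }
}
```

Comparing the printed names and types against the properties of the ModelInput and ModelOutput classes should show mismatches such as a string col1 where a boolean Label was expected, or a string PredictedLabel bound to a bool property.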