Open RokoToken opened 4 years ago
@RokoToken thank you for reporting this. Could you share the model.zip and a small subset of TrainingData.csv so we can reproduce this issue? Thanks.
Modified Titanic CSV Dataset
survived,sex,class,deck,embark_town,alone
TRUE,male,Third,unknown,Southampton,n
TRUE,female,First,C,Cherbourg,n
TRUE,female,Second,unknown,Southampton,y
MLNet CLI Command:
mlnet auto-train --task multiclass-classification --dataset "titanic.csv" --label-column-name "class"
Nimbus Code:
from nimbusml import Pipeline, FileDataStream
dataset = FileDataStream.read_csv('titanic.csv')
pipeline = Pipeline()
pipeline.load_model("MLModel.zip")
scores = pipeline.predict(dataset, y='class', evaltype='binary')
print(scores)
Error:
Error: *** System.ArgumentOutOfRangeException: 'Could not find label column 'PredictedLabel'
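A quick sanity check on the input side may help when narrowing this down. The sketch below uses only the Python standard library and the three sample rows posted above. It confirms that the column named in the error is not among the input columns, and that the `class` label has more than two distinct values even though the scoring call passes evaltype='binary'. The helper name `summarize_label` is hypothetical, and the check does not call NimbusML or explain the exception on its own.

```python
import csv
import io

# The three sample rows from the report, inlined so the sketch is self-contained.
SAMPLE_CSV = """survived,sex,class,deck,embark_town,alone
TRUE,male,Third,unknown,Southampton,n
TRUE,female,First,C,Cherbourg,n
TRUE,female,Second,unknown,Southampton,y
"""


def summarize_label(csv_text, label_column, missing_column):
    """Hypothetical helper: return (header, distinct label values,
    whether missing_column appears in the header)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    header = reader.fieldnames or []
    if label_column not in header:
        raise KeyError(f"label column {label_column!r} not in {header}")
    labels = sorted({row[label_column] for row in rows})
    return header, labels, missing_column in header


header, labels, has_predicted = summarize_label(SAMPLE_CSV, "class", "PredictedLabel")
print("columns:", header)
print("distinct labels:", labels)
print("PredictedLabel in input:", has_predicted)
```

If the label turns out to have more than two classes, as it does here, the 'binary' evaluation type in the scoring call is worth double-checking against the multiclass task used for training.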
There was a similar issue about six months ago (https://github.com/microsoft/NimbusML/issues/201), in which we were fixing NimbusML scoring of models trained in the AutoML.NET CLI.
@RokoToken: Can you post your MLModel.zip? Also, which version of the CLI are you using? mlnet --version
@justinormont @ganik mlnet version = 0.15.28007.4 @BuiltBy: dlab14-DDVSOWINAGE054 MLModel.zip
Is there a workaround for this? Should I use an older version of MLNet CLI? Is there a way to modify the output column through the Nimbus pipeline? Something like:
from nimbusml import Pipeline, FileDataStream
dataset = FileDataStream.read_csv('titanic.csv')
pipeline = Pipeline( add_output_column=PredictedLabel )
pipeline.load_model("MLModel.zip")
scores = pipeline.predict(dataset, y='class', evaltype='binary')
print(scores)
@RokoToken, the workaround is to find the pipeline params from AutoML.NET and re-train the same pipeline using either plain ML.NET or NimbusML. Also, can you try using pipeline.score(...)?
@ganik: Do you see anything odd with the posted model?
@RokoToken: I would expect that the AutoML.NET CLI is producing a normal ML.NET model. Your current version is the newest released version.
You can also re-train your model from the generated code which the CLI produced. You can uncomment the line ModelBuilder.CreateModel(), and run the project. You can also update the project requirements, as the codegen references an older version of ML.NET.
@RokoToken sorry for the delay. Could you please share the titanic.csv file? The model does look OK, so it should work. Thanks.
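For anyone who wants to look at the posted model themselves: the model file is a zip archive, so its entries can be listed without loading it into NimbusML or ML.NET. This is a minimal sketch using the standard library only. The function name `list_model_entries` is hypothetical, the path is assumed to be a local copy of MLModel.zip, and no particular entry names are assumed.

```python
import os
import zipfile


def list_model_entries(model_path):
    """Hypothetical helper: return (entry name, size in bytes) for each
    entry in the zip archive at model_path."""
    if not zipfile.is_zipfile(model_path):
        raise ValueError(f"{model_path!r} is not a readable zip archive")
    with zipfile.ZipFile(model_path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Assumed location of the model attached to this issue; adjust as needed.
MODEL_PATH = "MLModel.zip"
if os.path.exists(MODEL_PATH):
    for name, size in list_model_entries(MODEL_PATH):
        print(f"{size:>10}  {name}")
else:
    print(f"{MODEL_PATH} not found; nothing to list")
```

A listing like this only shows that the archive is intact and what it contains. It says nothing about whether the pipeline inside can be scored.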
I was able to debug through and get scoring to run after a few fixes in the NimbusML Python code (not ML.NET). However, the returned scores are NaN. Script:
from nimbusml import Pipeline, FileDataStream
dataset = FileDataStream.read_csv('E:/sources/tmp/titanic.csv')
print(dataset.head(3))
pipeline = Pipeline()
pipeline.load_model("E:/sources/tmp/MLModel.zip")
scores = pipeline.predict(dataset)
print(scores.head(3))
and output:
@justinormont Could you see if you can score this in ML.NET? I am not getting any scores from this model.
I used this csv test file below:
survived,sex,class,deck,embark_town,alone
TRUE,male,Third,unknown,Southampton,n
TRUE,female,First,C,Cherbourg,n
TRUE,female,Second,unknown,Southampton,y
Describe the bug When using the mlnet auto-train tool to create a model and then loading that model with NimbusML, an exception is thrown.
To Reproduce Steps to reproduce the behavior:
Expected behavior Loading and scoring the model should work as expected.
Actual behavior You get an exception and scoring is not completed:
Desktop (please complete the following information):
Additional Context