dotnet / machinelearning-modelbuilder

Simple UI tool to build custom machine learning models.
Creative Commons Attribution 4.0 International

Exception while auto-train with regression task #363

Closed smigielns closed 4 years ago

smigielns commented 4 years ago

Hello

I would like to report the error below. I also tried running it using Visual Studio and ML.NET Model Builder, but the result was the same.

image

debug_log.txt

LittleLittleCloud commented 4 years ago

Could we have your datasets? There is probably something wrong in AutoML.

smigielns commented 4 years ago

Yes, of course.

"Price" is the column to predict. testujemy.zip

I'm not familiar with ML. I ran the test just to see how it works, so the data file may also be incorrect.

I hope that helps.

LittleLittleCloud commented 4 years ago

I've run the experiment several times and can't reproduce the error. Could you also try re-running that experiment and see if the error occurs again? By the way, could I have the version of your mlnet CLI and the test file ".\test_walidaca..csv"? Thanks!

smigielns commented 4 years ago

test_walidacja.zip

My mlnet version is: 0.15.28007.4 @BuiltBy: dlab14-DDVSOWINAGE054 @Branch: features/automl @SrcCode: https://github.com/dotnet/machinelearning/tree/dc9a9b7ffcaf636541fe997c59f3bfdda57501e5+dc9a9b7ffcaf636541fe997c59f3bfdda57501e5

I have also run it using Visual Studio, and the error occurs when I increase the 'Time to train' value. It stops after the 8th iteration, the same as when I run it with the CLI. You can find the log file in the first message of this conversation.

In the example below, I used only the 'testujemy.csv' file.

ml_error

Have a good day!

JakeRadMSFT commented 4 years ago

@smigielns thanks for the added information. We'll try to reproduce the issue.

LittleLittleCloud commented 4 years ago

I can now reproduce the bug you encountered.

LittleLittleCloud commented 4 years ago

This is a bug in AutoML's sweepable parameters. In SweepableParamAttributes, if valueText is written with a decimal comma, such as "0,034456" (the Russian number format, for example), it is parsed as 34456 instead of 0.034456. If the parameter is significant (like learning_rate), the wrong value produces an unexpected training result (R-squared of -inf). That alone does not break the pipeline, because no exception is thrown. The exception comes in the next step, which handles the training results: there is no legitimate training result, so bestFoldIndex is -1.
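
To make the failure mode concrete, here is a minimal sketch in Python rather than the actual AutoML code. All three helper names are hypothetical stand-ins: `format_comma_decimal` imitates formatting under a comma-decimal culture, `parse_comma_as_grouping` imitates a parser that reads "," as a thousands separator, and `best_fold_index` is an assumed selection rule that skips non-finite scores. It only illustrates how a decimal comma can turn 0.034456 into 34456 and how a run with no usable score could end at index -1.

```python
import math


def format_comma_decimal(value: float) -> str:
    # Stand-in for formatting under a comma-decimal culture: 0.034456 -> "0,034456".
    return repr(value).replace(".", ",")


def parse_comma_as_grouping(text: str) -> float:
    # Stand-in for a parser that treats "," as a thousands separator and drops it.
    return float(text.replace(",", ""))


def best_fold_index(scores: list) -> int:
    # Hypothetical selection rule: index of the highest finite score, or -1 if none.
    best = -1
    for i, score in enumerate(scores):
        if math.isfinite(score) and (best == -1 or score > scores[best]):
            best = i
    return best


learning_rate = 0.034456
text = format_comma_decimal(learning_rate)
corrupted = parse_comma_as_grouping(text)
print(text, corrupted)  # the decimal comma is lost on the way back

# Formatting and parsing with the same "." convention round-trips the value.
safe = float(repr(learning_rate))
print(safe == learning_rate)

# If every fold scores -inf, no fold qualifies and the index stays at -1.
print(best_fold_index([-math.inf, -math.inf]))
```

The general remedy this suggests is to write and read numeric parameter text with one fixed, culture-independent convention, so the machine's regional settings cannot change the value.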

LittleLittleCloud commented 4 years ago

This will be fixed in the next release. Regards.