dotnet / machinelearning-modelbuilder

Simple UI tool to build custom machine learning models.
Creative Commons Attribution 4.0 International

More Friendly Errors #622

Closed luisquintanilla closed 4 years ago

luisquintanilla commented 4 years ago


Describe the bug

When running classification on a file with no header, omitting the --has-header option and passing a column index for --label-col prints the whole stack trace. It would be nicer to display a friendly error instead, write the stack trace to the log, and point the user to the log if they want more details.

Dataset:

Wow... Loved this place.    1
Crust is not good.  0
Not tasty and the texture was just nasty.   0

Command:

mlnet classification --name classification --dataset yelp_labelled.txt --label-col 4

Stack Trace / Error Message:

Unhandled exception: System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
   at System.Collections.Generic.List`1.get_Item(Int32 index)
   --- End of inner exception stack trace ---
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor, Boolean wrapExceptions)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at System.Reflection.RuntimePropertyInfo.GetValue(Object obj, Object[] index)
   at Microsoft.ML.ModelBuilder.AutoMLService.Contract.AutoMLServiceParamater..ctor(ILocalAutoMLTrainParameters paramater) in /_/src/Microsoft.ML.ModelBuilder.AutoMLService.Contract/AutoMLServiceParamater.cs:line 19
   at Microsoft.ML.CLI.Program.<>c__DisplayClass1_0.<Main>b__0(ClassificationCommand option) in /_/src/mlnet/Program.cs:line 64
   --- End of inner exception stack trace ---
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor, Boolean wrapExceptions)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at System.Delegate.DynamicInvokeImpl(Object[] args)
   at System.CommandLine.Invocation.ModelBindingCommandHandler.InvokeAsync(InvocationContext context)
   at System.CommandLine.Invocation.InvocationPipeline.<>c__DisplayClass4_0.<<BuildInvocationChain>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<UseParseErrorReporting>b__19_0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass14_0.<<UseHelp>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass22_0.<<UseVersionOption>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass21_0.<<UseTypoCorrections>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<UseSuggestDirective>b__20_0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<UseParseDirective>b__18_0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<UseDebugDirective>b__10_0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<RegisterWithDotnetSuggest>b__9_0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass12_0.<<UseExceptionHandler>b__0>d.MoveNext()
briacht commented 4 years ago

I think your error was that you chose a col index of 4 when there are only 2 columns in the dataset (but yes, that should have a friendlier error).
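
To make the mismatch concrete, here is a minimal sketch of the bounds check in Python. It is an illustration only, not the tool's actual C# code: `check_label_index` is a hypothetical helper, the sample rows are assumed to be tab-separated, and the index is assumed to be zero-based (which is what "non-negative and less than the size of the collection" suggests).

```python
import csv
import io
from typing import Optional

# The three sample rows from the report, assumed to be tab-separated.
SAMPLE = (
    "Wow... Loved this place.\t1\n"
    "Crust is not good.\t0\n"
    "Not tasty and the texture was just nasty.\t0\n"
)


def check_label_index(text: str, label_col: int, delimiter: str = "\t") -> Optional[str]:
    """Return a friendly error message, or None if the index is valid.

    Hypothetical helper: counts the columns in the first row and compares
    the requested (zero-based) label index against that count.
    """
    first_row = next(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not 0 <= label_col < len(first_row):
        return (
            f"Specified label column index {label_col} is out of range. "
            "Must be non-negative and less than the size of the collection."
        )
    return None


print(check_label_index(SAMPLE, 4))  # index 4 does not exist in a 2-column file
print(check_label_index(SAMPLE, 1))  # index 1 is the label column here
```

With a check like this in front of the training call, `--label-col 4` would produce one readable line instead of the nested `TargetInvocationException` above.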

briacht commented 4 years ago

Keep existing:
• If a user selects a dataset that doesn’t exist: File or Directory does not exist: dataset.txt
• If a user selects a dataset that has the right extension but is not formatted correctly: One or more errors occurred. (Unable to split the file provided into multiple, consistent columns.)

Modify existing:
• For --label-col (name), if user selects column “X” that does not exist in the dataset: One or more errors occurred. (Specified label column 'X' does not exist in the dataset.)
• For --ignore-cols, if user selects column “X” that does not exist in the dataset: One or more errors occurred. (Specified column “X” does not exist in the dataset.)
• For --ignore-cols, if user selects column index X that does not exist in the dataset: One or more errors occurred. (Specified column index X is out of range. Must be non-negative and less than the size of the collection.)
• If user selects --output and folder name has incorrect characters / syntax (e.g. --output ?dataset), user gets error before training starts (e.g. training will not start until they fix output folder name): One or more errors occurred. (Specified output folder name syntax is incorrect.)

New:
• For classification and regression, if user selects a tabular file that is not a .txt, .tsv, or .csv for their training, testing, or validation dataset: One or more errors occurred. (File type not supported. File must be .csv, .tsv, or .txt format.)
• For image classification, if user selects folder without sub-folders of images: One or more errors occurred. (Data is not in correct format. Selected folder should have labelled sub-directories containing images for classification.)
• For --label-col, if user selects column index X that is out of bounds in the dataset: One or more errors occurred. (Specified label column index X is out of range. Must be non-negative and less than the size of the collection.)
• If user indicates --has-header false, and they try to input a column name (instead of an index): One or more errors occurred. (You specified --has-header false and inputted a column name for --label-col. Please input a column index or change to --has-header true).
• If user tries to train on dataset X that is open in Excel, Notepad, etc.: One or more errors occurred. (Your dataset X is being used by another process. Please close and try training again.)
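
The pattern requested in the issue (short message on the console, full stack trace in the log) could be sketched roughly as below. This is a hedged Python illustration, not the CLI's real implementation: `FriendlyError`, `validate_args`, and `run` are hypothetical names, only two of the checks listed above are shown, and the wording of the fallback message for unexpected errors is an assumption.

```python
import os
import traceback

SUPPORTED_EXTENSIONS = {".csv", ".tsv", ".txt"}


class FriendlyError(Exception):
    """Hypothetical exception type that carries a user-facing message."""


def validate_args(dataset: str, label_col: str, has_header: bool) -> None:
    """Check arguments before training starts; raise FriendlyError on bad input."""
    extension = os.path.splitext(dataset)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise FriendlyError(
            "File type not supported. File must be .csv, .tsv, or .txt format."
        )
    # Without a header row there are no column names, so only an index makes sense.
    if not has_header and not label_col.isdigit():
        raise FriendlyError(
            "You specified --has-header false and inputted a column name for "
            "--label-col. Please input a column index or change to --has-header true."
        )


def run(action, log_path: str) -> str:
    """Run action(); on failure, append the full trace to the log and
    return a one-line message that points the user at the log."""
    try:
        action()
        return "OK"
    except Exception as err:
        if isinstance(err, FriendlyError):
            message = str(err)
        else:
            # Assumed wording for errors that no check anticipated.
            message = "An unexpected error occurred."
        # format_exc() must be called while the exception is being handled.
        with open(log_path, "a", encoding="utf-8") as log:
            log.write(traceback.format_exc())
        return f"One or more errors occurred. ({message}) See {log_path} for details."
```

For example, `run(lambda: validate_args("yelp_labelled.xlsx", "4", False), "mlnet.log")` would return the "File type not supported" message and leave the trace in the (hypothetical) `mlnet.log` file. Validating up front also matches the note above that training should not start until the input is fixed.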