Closed jamiefutch closed 3 years ago
@jamiefutch Sorry you've experienced a problem and thanks for reporting! Just to clarify, is your data separated by tabs or spaces? Is the separator between "863" and "i love going here" a full tab?
@beccamc Sorry, I left that out. The values are tab-separated. The offending line is: 863 i love going here 1, i.e. (escaped): 863\ti love going here\t1
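For anyone else checking their own file: one quick way to confirm that every row is really tab-separated (rather than space-separated) is to count the tab-delimited fields on each line. Below is a minimal Python sketch, assuming three columns (Id, text, class) as in this dataset. The function name `find_bad_rows` is hypothetical and is not part of ML.NET or Model Builder.

```python
def find_bad_rows(text, expected_columns=3):
    """Return (line_number, field_count) for each non-empty line whose
    number of tab-separated fields differs from expected_columns."""
    bad_rows = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        field_count = len(line.split("\t"))
        if field_count != expected_columns:
            bad_rows.append((line_number, field_count))
    return bad_rows


# Rows separated by real tabs pass the check.
tabbed = "Id\ttext\tclass\n863\ti love going here\t1\n794\texcellent\t1\n"
# A row separated by spaces is seen as a single field and gets reported.
spaced = "863 i love going here 1\n"

print(find_bad_rows(tabbed))
print(find_bad_rows(spaced))
```

An empty result only shows that the separators are consistent. It does not rule out other causes of the training failure.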
After the refactor we still get an error with this dataset. I need to dig into this further.
System Information (please complete the following information):
Describe the bug AutoML binary classification experiment fails when the following text is a feature column: i love going here
To Reproduce Steps to reproduce the behavior:
Use this data as the data file (tab-separated):
Id	text	class
863	i love going here	1
794	excellent	1
802	good	1
805	good contacts	1
806	awesome	1
807	good	1
808	good	1
809	good	1
810	good	1
811	love new location	1
813	nice professional	1
814	new facility nice	1
817	very good	1
818	very good	1
819	very good	1
830	bad	0
840	stupid person	0
In AutoML -> Text Classification, set the columns: id: ignore, text: feature, class: label
Predict the class column
See error:
Training failed with the exception: System.ArgumentOutOfRangeException: Could not find feature column 'Features'
Parameter name: inputSchema
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.CheckInputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.GetOutputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Data.EstimatorChain`1.GetOutputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Data.EstimatorChain`1.GetOutputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String labelColumn, IMetricsAgent`1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, IChannel logger)
   at Microsoft.ML.AutoML.Experiment`2.Execute()
   at Microsoft.ML.AutoML.ExperimentBase`2.Execute(ColumnInformation columnInfo, DatasetColumnInfo[] columns, IEstimator`1 preFeaturizer, IProgress`1 progressHandler, IRunner`1 runner)
   at Microsoft.ML.AutoML.ExperimentBase`2.ExecuteCrossValSummary(IDataView[] trainDatasets, ColumnInformation columnInfo, IDataView[] validationDatasets, IEstimator`1 preFeaturizer, IProgress`1 progressHandler)
   at Microsoft.ML.AutoML.ExperimentBase`2.Execute(IDataView trainData, ColumnInformation columnInformation, IEstimator`1 preFeaturizer, IProgress`1 progressHandler)
   at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.AutoMLExperiment`3.<>c__DisplayClass21_0.b__5() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/Experiments/AutoMLExperiment.cs:line 81
   at System.Threading.Tasks.Task`1.InnerInvoke()
   at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.AutoMLExperiment`3.d__21.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/Experiments/AutoMLExperiment.cs:line 108
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.AutoMLEngine.d__30.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/AutoMLEngineService/AutoMLEngine.cs:line 147
Expected behavior Model is trained and tested
Screenshots N/A
Additional context Log:
2020-07-21 18:18:00.7024 DEBUG Disposing TrainSession (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)
2020-07-21 18:18:00.7024 DEBUG Disposing AutoMLService Client (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)
2020-07-21 18:18:00.7044 DEBUG Disposing TrainSession (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)
2020-07-21 18:18:00.7274 INFO | Trainer MicroAccuracy MacroAccuracy Duration #Iteration | (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2020-07-21 18:18:00.7614 INFO Could not find input column 'Features' Parameter name: inputSchema (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2020-07-21 18:18:00.7614 INFO Could not find input column 'Features' Parameter name: inputSchema (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2020-07-21 18:18:00.7724 INFO Could not find feature column 'Features' Parameter name: inputSchema (Microsoft.ML.ModelBuilder.Utils.Logger.Info)
2020-07-21 18:18:00.7724 DEBUG Training failed with the exception: System.ArgumentOutOfRangeException: Could not find feature column 'Features'
Parameter name: inputSchema
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.CheckInputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.GetOutputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Data.EstimatorChain`1.GetOutputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Data.EstimatorChain`1.GetOutputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String labelColumn, IMetricsAgent`1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, IChannel logger)
   at Microsoft.ML.AutoML.Experiment`2.Execute()
   at Microsoft.ML.AutoML.ExperimentBase`2.Execute(ColumnInformation columnInfo, DatasetColumnInfo[] columns, IEstimator`1 preFeaturizer, IProgress`1 progressHandler, IRunner`1 runner)
   at Microsoft.ML.AutoML.ExperimentBase`2.ExecuteCrossValSummary(IDataView[] trainDatasets, ColumnInformation columnInfo, IDataView[] validationDatasets, IEstimator`1 preFeaturizer, IProgress`1 progressHandler)
   at Microsoft.ML.AutoML.ExperimentBase`2.Execute(IDataView trainData, ColumnInformation columnInformation, IEstimator`1 preFeaturizer, IProgress`1 progressHandler)
   at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.AutoMLExperiment`3.<>c__DisplayClass21_0.b__5() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/Experiments/AutoMLExperiment.cs:line 81
   at System.Threading.Tasks.Task`1.InnerInvoke()
   at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.AutoMLExperiment`3.d__21.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/Experiments/AutoMLExperiment.cs:line 108
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.AutoMLEngine.d__30.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/AutoMLEngineService/AutoMLEngine.cs:line 147 (Microsoft.ML.ModelBuilder.Utils.Logger.Debug)