dotnet / machinelearning-modelbuilder

Simple UI tool to build custom machine learning models.
Creative Commons Attribution 4.0 International

Error occured while retreiving best pipeline - NullReferenceException #671

Closed: JoeMayo closed this issue 4 years ago

JoeMayo commented 5 years ago

System information

Issue

Source code / logs


Results from Output window:

Welcome to the ML.NET CLI!

Learn more about ML.NET CLI: https://aka.ms/mlnet-cli
Use 'mlnet --help' to see available commands or visit: https://aka.ms/mlnet-cli-docs

Telemetry

The ML.NET CLI tool collects usage data in order to help us improve your experience. The data is anonymous and doesn't include personal information or data from your datasets.

Read more about ML.NET CLI Tool telemetry: https://aka.ms/mlnet-cli-telemetry

You can opt-out of telemetry by setting the MLDOTNET_CLI_TELEMETRY_OPTOUT environment variable to '1' or 'true' using your favorite shell.

Inferring Columns ...
Creating Data loader ...
Loading data ...
Exploring multiple ML algorithms and settings to find you the best model for ML task: binary-classification
For further learning check: https://aka.ms/mlnet-cli
| Trainer Accuracy AUC AUPRC F1-score Duration #Iteration |
[Source=AutoML, Kind=Trace] Channel started
[Source=AutoML, Kind=Trace] Evaluating pipeline xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf:_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf,1467810369} xf=Normalizing{ col=Features:Features} tr=AveragedPerceptronBinary{} cache=+
[Source=AutoML, Kind=Error] Pipeline crashed: xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoothttptwitpic_com_2y1zl_Awwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_doitD_tf:_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf,1467810369} xf=Normalizing{ col=Features:Features} tr=AveragedPerceptronBinary{} cache=+ .
Exception: System.InvalidOperationException: Event we were waiting on was subject to an exception ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.FormatException: Parsing failed with an exception: Could not parse value 4 in line 800001, column 0 ---> System.InvalidOperationException: Could not parse value 4 in line 800001, column 0 at Microsoft.ML.Data.TextLoader.Parser.ProcessOne(FieldSet vs, ColInfo info, ColumnPipe v, Int32 irow, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ProcessItems(RowSet rows, Int32 irow, Boolean[] active, FieldSet fields, Int32 srcLim, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ParseRow(RowSet rows, Int32 irow, Helper helper, Boolean[] active, String path, Int64 line, String text) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.Parse(Int32 tid) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.ThreadProc(Object obj) --- End of inner exception stack trace --- at Microsoft.ML.Data.TextLoader.Cursor.ParseParallel(ParallelState state)+MoveNext() at Microsoft.ML.Data.TextLoader.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.LinkedRowFilterCursorBase.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass9_0.b1() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass5_1.b2() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.CacheDataView.Filler(DataViewRowCursor cursor, ColumnCache[] caches, OrderedWaiter waiter) --- End of inner exception stack trace --- at Microsoft.ML.Internal.Utilities.OrderedWaiter.Wait(Int64 position, CancellationToken token) at Microsoft.ML.Data.CacheDataView.GetPermutationOrNull(Random rand) at Microsoft.ML.Data.CacheDataView.GetRowCursorWaiterCore[TWaiter](TWaiter waiter, Func2 predicate, Random rand) at Microsoft.ML.Data.CacheDataView.GetRowCursor(IEnumerable1 columnsNeeded, Random rand) at Microsoft.ML.Trainers.TrainingCursorBase.FactoryBase1.Create(Random rand, Int32[] extraCols) at Microsoft.ML.Trainers.OnlineLinearTrainer2.TrainCore(IChannel ch, RoleMappedData data, TrainStateBase state) at Microsoft.ML.Trainers.OnlineLinearTrainer2.TrainModelCore(TrainContext context) at Microsoft.ML.Trainers.TrainerEstimatorBase2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor) at Microsoft.ML.Data.EstimatorChain1.Fit(IDataView input) at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String labelColumn, IMetricsAgent1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, AutoMLLogger logger)
[Source=AutoML, Kind=Trace] 1 NaN 00:01:57.8357040 xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoothttptwitpic_com_2y1zl_Awwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_doitD_tf:_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf,1467810369} 
xf=Normalizing{ col=Features:Features} tr=AveragedPerceptronBinary{} cache=+
|1 AveragedPerceptronBinary NaN NaN NaN NaN 117.8 0 |
System.InvalidOperationException: Event we were waiting on was subject to an exception ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.FormatException: Parsing failed with an exception: Could not parse value 4 in line 800001, column 0 ---> System.InvalidOperationException: Could not parse value 4 in line 800001, column 0 at Microsoft.ML.Data.TextLoader.Parser.ProcessOne(FieldSet vs, ColInfo info, ColumnPipe v, Int32 irow, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ProcessItems(RowSet rows, Int32 irow, Boolean[] active, FieldSet fields, Int32 srcLim, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ParseRow(RowSet rows, Int32 irow, Helper helper, Boolean[] active, String path, Int64 line, String text) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.Parse(Int32 tid) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.ThreadProc(Object obj) --- End of inner exception stack trace --- at Microsoft.ML.Data.TextLoader.Cursor.ParseParallel(ParallelState state)+MoveNext() at Microsoft.ML.Data.TextLoader.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.LinkedRowFilterCursorBase.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass9_0.b1() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass5_1.b2() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.CacheDataView.Filler(DataViewRowCursor cursor, ColumnCache[] caches, OrderedWaiter waiter) --- End of inner exception stack trace --- at Microsoft.ML.Internal.Utilities.OrderedWaiter.Wait(Int64 position, CancellationToken token) at Microsoft.ML.Data.CacheDataView.GetPermutationOrNull(Random rand) at Microsoft.ML.Data.CacheDataView.GetRowCursorWaiterCore[TWaiter](TWaiter waiter, Func2 predicate, Random rand) at Microsoft.ML.Data.CacheDataView.GetRowCursor(IEnumerable1 columnsNeeded, Random rand) at Microsoft.ML.Trainers.TrainingCursorBase.FactoryBase1.Create(Random rand, Int32[] extraCols) at Microsoft.ML.Trainers.OnlineLinearTrainer2.TrainCore(IChannel ch, RoleMappedData data, TrainStateBase state) at Microsoft.ML.Trainers.OnlineLinearTrainer2.TrainModelCore(TrainContext context) at Microsoft.ML.Trainers.TrainerEstimatorBase2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor) at Microsoft.ML.Data.EstimatorChain1.Fit(IDataView input) at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String labelColumn, IMetricsAgent1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, AutoMLLogger logger)
[Source=AutoML, Kind=Trace] Evaluating pipeline xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoothttptwitpic_com_2y1zl_Awwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_doitD_tf:_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ 
col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf,1467810369} xf=Normalizing{ col=Features:Features} tr=SdcaLogisticRegressionBinary{} cache=+
[Source=AutoML, Kind=Error] Pipeline crashed: xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoothttptwitpic_com_2y1zl_Awwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_doitD_tf:_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf,1467810369} xf=Normalizing{ col=Features:Features} tr=SdcaLogisticRegressionBinary{} cache=+ . Exception: System.InvalidOperationException: Event we were waiting on was subject to an exception ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.FormatException: Parsing failed with an exception: Could not parse value 4 in line 800001, column 0 ---> System.InvalidOperationException: Could not parse value 4 in line 800001, column 0 at Microsoft.ML.Data.TextLoader.Parser.ProcessOne(FieldSet vs, ColInfo info, ColumnPipe v, Int32 irow, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ProcessItems(RowSet rows, Int32 irow, Boolean[] active, FieldSet fields, Int32 srcLim, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ParseRow(RowSet rows, Int32 irow, Helper helper, Boolean[] active, String path, Int64 line, String text) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.Parse(Int32 tid) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.ThreadProc(Object obj) --- End of inner exception stack trace --- at Microsoft.ML.Data.TextLoader.Cursor.ParseParallel(ParallelState state)+MoveNext() at Microsoft.ML.Data.TextLoader.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.LinkedRowFilterCursorBase.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass9_0.b1() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass5_1.b2() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.CacheDataView.Filler(DataViewRowCursor cursor, ColumnCache[] caches, OrderedWaiter waiter) --- End of inner exception stack trace --- at Microsoft.ML.Internal.Utilities.OrderedWaiter.Wait(Int64 position, CancellationToken token) at Microsoft.ML.Data.CacheDataView.WaiterWaiter.Wait(Int64 pos) at Microsoft.ML.Data.CacheDataView.RowCursor1.MoveNext() at Microsoft.ML.Trainers.TrainingCursorBase.MoveNext() at Microsoft.ML.Trainers.SdcaTrainerBase3.TrainCore(IChannel ch, RoleMappedData data, LinearModelParameters predictor, Int32 weightSetCount) at Microsoft.ML.Trainers.StochasticTrainerBase2.TrainModelCore(TrainContext context) at Microsoft.ML.Trainers.TrainerEstimatorBase2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor) at Microsoft.ML.Data.EstimatorChain1.Fit(IDataView input) at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String labelColumn, IMetricsAgent1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, AutoMLLogger logger)
[Source=AutoML, Kind=Trace] 2 NaN 00:02:17.6592338 xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoothttptwitpic_com_2y1zl_Awwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_doitD_tf:_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf,1467810369} xf=Normalizing{ col=Features:Features} tr=SdcaLogisticRegressionBinary{} cache=+
|2 SdcaLogisticRegressionBinary NaN NaN NaN NaN 137.7 0 |
System.InvalidOperationException: Event we were waiting on was subject to an exception ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.FormatException: Parsing failed with an exception: Could not parse value 4 in line 800001, column 0 ---> System.InvalidOperationException: Could not parse value 4 in line 800001, column 0 at Microsoft.ML.Data.TextLoader.Parser.ProcessOne(FieldSet vs, ColInfo info, ColumnPipe v, Int32 irow, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ProcessItems(RowSet rows, Int32 irow, Boolean[] active, FieldSet fields, Int32 srcLim, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ParseRow(RowSet rows, Int32 irow, Helper helper, Boolean[] active, String path, Int64 line, String text) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.Parse(Int32 tid) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.ThreadProc(Object obj) --- End of inner exception stack trace --- at Microsoft.ML.Data.TextLoader.Cursor.ParseParallel(ParallelState state)+MoveNext() at Microsoft.ML.Data.TextLoader.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.LinkedRowFilterCursorBase.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass9_0.b1() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass5_1.b2() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.CacheDataView.Filler(DataViewRowCursor cursor, ColumnCache[] caches, OrderedWaiter waiter) --- End of inner exception stack trace --- at Microsoft.ML.Internal.Utilities.OrderedWaiter.Wait(Int64 position, CancellationToken token) at Microsoft.ML.Data.CacheDataView.WaiterWaiter.Wait(Int64 pos) at Microsoft.ML.Data.CacheDataView.RowCursor1.MoveNext() at Microsoft.ML.Trainers.TrainingCursorBase.MoveNext() at Microsoft.ML.Trainers.SdcaTrainerBase3.TrainCore(IChannel ch, RoleMappedData data, LinearModelParameters predictor, Int32 weightSetCount) at Microsoft.ML.Trainers.StochasticTrainerBase2.TrainModelCore(TrainContext context) at Microsoft.ML.Trainers.TrainerEstimatorBase2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor) at Microsoft.ML.Data.EstimatorChain1.Fit(IDataView input) at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String labelColumn, IMetricsAgent1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, AutoMLLogger logger)
[Source=AutoML, Kind=Trace] Evaluating pipeline xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoothttptwitpic_com_2y1zl_Awwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_doitD_tf:_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it___D_tf,1467810369} tr=LightGbmBinary{} cache=-
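The same message, `Could not parse value 4 in line 800001, column 0`, appears for both AveragedPerceptronBinary and SdcaLogisticRegressionBinary, which suggests the problem lies in the data file or its inferred schema rather than in any one trainer. A minimal Python sketch for checking what column 0 actually contains is shown here. It assumes a comma-separated file, and it assumes the column was inferred as a boolean label; the log does not state the inferred type. `BOOL_TOKENS` and `scan_column` are hypothetical names for illustration and do not reproduce ML.NET's parsing rules.

```python
import csv
import io
from collections import Counter

# Assumption for this sketch: tokens a boolean column could plausibly hold.
# This is NOT ML.NET's actual parsing rule set.
BOOL_TOKENS = {"0", "1", "true", "false"}


def scan_column(lines, column=0, delimiter=","):
    """Count the values in one column and find the first non-boolean one.

    `lines` is any iterable of text lines (an open file works).
    Returns (Counter of values, line number of first non-boolean value or None).
    """
    counts = Counter()
    first_bad_line = None
    reader = csv.reader(lines, delimiter=delimiter)
    for row in reader:
        if column >= len(row):
            continue  # skip blank or short rows
        value = row[column].strip()
        counts[value] += 1
        if first_bad_line is None and value.lower() not in BOOL_TOKENS:
            # line_num is the 1-based count of source lines read so far
            first_bad_line = reader.line_num
    return counts, first_bad_line


if __name__ == "__main__":
    # Tiny made-up sample standing in for the real data file.
    sample = io.StringIO('0,"NO_QUERY","a"\n0,"NO_QUERY","b"\n4,"NO_QUERY","c"\n')
    counts, first_bad = scan_column(sample)
    print(dict(counts), first_bad)
```

To run it against the real data, pass an open file handle instead of the in-memory sample. If the first non-boolean line turns out to be 800001, that would be consistent with a column type inferred from the earlier rows only.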

Inferring Columns ...
Creating Data loader ...
Loading data ...
Exploring multiple ML algorithms and settings to find you the best model for ML task: binary-classification
For further learning check: https://aka.ms/mlnet-cli
| Trainer Accuracy AUC AUPRC F1-score Duration #Iteration |
[Source=AutoML, Kind=Trace] Channel started
[Source=AutoML, Kind=Trace] Evaluating pipeline xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf:_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf,1467810369} xf=Normalizing{ col=Features:Features} tr=AveragedPerceptronBinary{} cache=+
[Source=AutoML, Kind=Error] Pipeline crashed: xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoothttptwitpic_com_2y1zl_Awwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_doitD_tf:_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf,1467810369} xf=Normalizing{ col=Features:Features} tr=AveragedPerceptronBinary{} cache=+ .
Exception: System.InvalidOperationException: Event we were waiting on was subject to an exception ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.FormatException: Parsing failed with an exception: Could not parse value 4 in line 800001, column 0 ---> System.InvalidOperationException: Could not parse value 4 in line 800001, column 0 at Microsoft.ML.Data.TextLoader.Parser.ProcessOne(FieldSet vs, ColInfo info, ColumnPipe v, Int32 irow, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ProcessItems(RowSet rows, Int32 irow, Boolean[] active, FieldSet fields, Int32 srcLim, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ParseRow(RowSet rows, Int32 irow, Helper helper, Boolean[] active, String path, Int64 line, String text) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.Parse(Int32 tid) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.ThreadProc(Object obj) --- End of inner exception stack trace --- at Microsoft.ML.Data.TextLoader.Cursor.ParseParallel(ParallelState state)+MoveNext() at Microsoft.ML.Data.TextLoader.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.LinkedRowFilterCursorBase.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass9_0.b1() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass5_1.b2() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at 
Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.CacheDataView.Filler(DataViewRowCursor cursor, ColumnCache[] caches, OrderedWaiter waiter) --- End of inner exception stack trace --- at Microsoft.ML.Internal.Utilities.OrderedWaiter.Wait(Int64 position, CancellationToken token) at Microsoft.ML.Data.CacheDataView.GetPermutationOrNull(Random rand) at Microsoft.ML.Data.CacheDataView.GetRowCursorWaiterCore[TWaiter](TWaiter waiter, Func2 predicate, Random rand) at Microsoft.ML.Data.CacheDataView.GetRowCursor(IEnumerable1 columnsNeeded, Random rand) at Microsoft.ML.Trainers.TrainingCursorBase.FactoryBase1.Create(Random rand, Int32[] extraCols) at Microsoft.ML.Trainers.OnlineLinearTrainer2.TrainCore(IChannel ch, RoleMappedData data, TrainStateBase state) at Microsoft.ML.Trainers.OnlineLinearTrainer2.TrainModelCore(TrainContext context) at Microsoft.ML.Trainers.TrainerEstimatorBase2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor) at Microsoft.ML.Data.EstimatorChain1.Fit(IDataView input) at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String labelColumn, IMetricsAgent1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, AutoMLLogger logger) [Source=AutoML, Kind=Trace] 1 NaN 00:02:02.7769983 xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoothttptwitpic_com_2y1zl_Awwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_doitD_tf:_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf,1467810369} xf=Normalizing{ 
col=Features:Features} tr=AveragedPerceptronBinary{} cache=+ |1 AveragedPerceptronBinary NaN NaN NaN NaN 122.8 0 | System.InvalidOperationException: Event we were waiting on was subject to an exception ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.FormatException: Parsing failed with an exception: Could not parse value 4 in line 800001, column 0 ---> System.InvalidOperationException: Could not parse value 4 in line 800001, column 0 at Microsoft.ML.Data.TextLoader.Parser.ProcessOne(FieldSet vs, ColInfo info, ColumnPipe v, Int32 irow, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ProcessItems(RowSet rows, Int32 irow, Boolean[] active, FieldSet fields, Int32 srcLim, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ParseRow(RowSet rows, Int32 irow, Helper helper, Boolean[] active, String path, Int64 line, String text) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.Parse(Int32 tid) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.ThreadProc(Object obj) --- End of inner exception stack trace --- at Microsoft.ML.Data.TextLoader.Cursor.ParseParallel(ParallelState state)+MoveNext() at Microsoft.ML.Data.TextLoader.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.LinkedRowFilterCursorBase.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass9_0.b1() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass5_1.b2() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] 
pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.CacheDataView.Filler(DataViewRowCursor cursor, ColumnCache[] caches, OrderedWaiter waiter) --- End of inner exception stack trace --- at Microsoft.ML.Internal.Utilities.OrderedWaiter.Wait(Int64 position, CancellationToken token) at Microsoft.ML.Data.CacheDataView.GetPermutationOrNull(Random rand) at Microsoft.ML.Data.CacheDataView.GetRowCursorWaiterCore[TWaiter](TWaiter waiter, Func2 predicate, Random rand) at Microsoft.ML.Data.CacheDataView.GetRowCursor(IEnumerable1 columnsNeeded, Random rand) at Microsoft.ML.Trainers.TrainingCursorBase.FactoryBase1.Create(Random rand, Int32[] extraCols) at Microsoft.ML.Trainers.OnlineLinearTrainer2.TrainCore(IChannel ch, RoleMappedData data, TrainStateBase state) at Microsoft.ML.Trainers.OnlineLinearTrainer2.TrainModelCore(TrainContext context) at Microsoft.ML.Trainers.TrainerEstimatorBase2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor) at Microsoft.ML.Data.EstimatorChain1.Fit(IDataView input) at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String labelColumn, IMetricsAgent1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, AutoMLLogger logger) [Source=AutoML, Kind=Trace] Evaluating pipeline xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoothttptwitpic_com_2y1zl_Awwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_doitD_tf:_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ 
col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf,1467810369} xf=Normalizing{ col=Features:Features} tr=SdcaLogisticRegressionBinary{} cache=+ [Source=AutoML, Kind=Error] Pipeline crashed: xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoothttptwitpic_com_2y1zl_Awwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_doitD_tf:_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf,1467810369} xf=Normalizing{ col=Features:Features} tr=SdcaLogisticRegressionBinary{} cache=+ . Exception: System.InvalidOperationException: Event we were waiting on was subject to an exception ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.FormatException: Parsing failed with an exception: Could not parse value 4 in line 800001, column 0 ---> System.InvalidOperationException: Could not parse value 4 in line 800001, column 0 at Microsoft.ML.Data.TextLoader.Parser.ProcessOne(FieldSet vs, ColInfo info, ColumnPipe v, Int32 irow, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ProcessItems(RowSet rows, Int32 irow, Boolean[] active, FieldSet fields, Int32 srcLim, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ParseRow(RowSet rows, Int32 irow, Helper helper, Boolean[] active, String path, Int64 line, String text) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.Parse(Int32 tid) at 
Microsoft.ML.Data.TextLoader.Cursor.ParallelState.ThreadProc(Object obj) --- End of inner exception stack trace --- at Microsoft.ML.Data.TextLoader.Cursor.ParseParallel(ParallelState state)+MoveNext() at Microsoft.ML.Data.TextLoader.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.LinkedRowFilterCursorBase.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass9_0.b1() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass5_1.b2() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.CacheDataView.Filler(DataViewRowCursor cursor, ColumnCache[] caches, OrderedWaiter waiter) --- End of inner exception stack trace --- at Microsoft.ML.Internal.Utilities.OrderedWaiter.Wait(Int64 position, CancellationToken token) at Microsoft.ML.Data.CacheDataView.WaiterWaiter.Wait(Int64 pos) at Microsoft.ML.Data.CacheDataView.RowCursor1.MoveNext() at Microsoft.ML.Trainers.TrainingCursorBase.MoveNext() at Microsoft.ML.Trainers.SdcaTrainerBase3.TrainCore(IChannel ch, RoleMappedData data, LinearModelParameters predictor, Int32 weightSetCount) at Microsoft.ML.Trainers.StochasticTrainerBase2.TrainModelCore(TrainContext context) at Microsoft.ML.Trainers.TrainerEstimatorBase2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor) at Microsoft.ML.Data.EstimatorChain1.Fit(IDataView input) at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String labelColumn, IMetricsAgent1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, AutoMLLogger logger)
[Source=AutoML, Kind=Trace] 2 NaN 00:02:11.1776178 xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoothttptwitpic_com_2y1zl_Awwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_doitD_tf:_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf,1467810369} xf=Normalizing{ col=Features:Features} tr=SdcaLogisticRegressionBinary{} cache=+
|2 SdcaLogisticRegressionBinary NaN NaN NaN NaN 131.2 0 |
System.InvalidOperationException: Event we were waiting on was subject to an exception ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> System.FormatException: Parsing failed with an exception: Could not parse value 4 in line 800001, column 0 ---> System.InvalidOperationException: Could not parse value 4 in line 800001, column 0 at Microsoft.ML.Data.TextLoader.Parser.ProcessOne(FieldSet vs, ColInfo info, ColumnPipe v, Int32 irow, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ProcessItems(RowSet rows, Int32 irow, Boolean[] active, FieldSet fields, Int32 srcLim, Int64 line) at Microsoft.ML.Data.TextLoader.Parser.ParseRow(RowSet rows, Int32 irow, Helper helper, Boolean[] active, String path, Int64 line, String text) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.Parse(Int32 tid) at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.ThreadProc(Object obj) --- End 
of inner exception stack trace --- at Microsoft.ML.Data.TextLoader.Cursor.ParseParallel(ParallelState state)+MoveNext() at Microsoft.ML.Data.TextLoader.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.LinkedRowFilterCursorBase.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass9_0.b1() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.DataViewUtils.Splitter.<>cDisplayClass5_1.b2() --- End of inner exception stack trace --- at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Data.CacheDataView.Filler(DataViewRowCursor cursor, ColumnCache[] caches, OrderedWaiter waiter) --- End of inner exception stack trace --- at Microsoft.ML.Internal.Utilities.OrderedWaiter.Wait(Int64 position, CancellationToken token) at Microsoft.ML.Data.CacheDataView.RowCursor1.MoveNext() at Microsoft.ML.Trainers.TrainingCursorBase.MoveNext() at Microsoft.ML.Trainers.StochasticTrainerBase2.TrainModelCore(TrainContext context) at Microsoft.ML.Trainers.SdcaTrainerBase3.TrainCore(IChannel ch, RoleMappedData data, LinearModelParameters predictor, Int32 weightSetCount) at Microsoft.ML.Trainers.TrainerEstimatorBase2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor) at Microsoft.ML.Data.EstimatorChain1.Fit(IDataView input) at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String labelColumn, IMetricsAgent1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema 
modelInputSchema, AutoMLLogger logger)
at Microsoft.ML.Data.CacheDataView.WaiterWaiter.Wait(Int64 pos)
[Source=AutoML, Kind=Trace] Evaluating pipeline xf=OneHotEncoding{ col=NO_QUERY:NO_QUERY} xf=TextFeaturizing{ col=Mon_Apr_06_22_19_45_PDT_2009_tf:Mon_Apr_06_22_19_45_PDT_2009} xf=TextFeaturizing{ col=_switchfoothttptwitpic_com_2y1zl_Awwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_doitD_tf:_switchfoot_http_twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D} xf=ColumnConcatenating{ col=Features:NO_QUERY,Mon_Apr_06_22_19_45_PDT_2009_tf,_switchfoot_http___twitpic_com2y1zlAwwwthat_s_abummerYou_shoulda_got_David_Carr_of_Third_Day_to_do_it_D_tf,1467810369} tr=LightGbmBinary{} cache=-
Error occured while retreiving best pipeline. Object reference not set to an instance of an object.
System.NullReferenceException: Object reference not set to an instance of an object.
at Microsoft.ML.CLI.Telemetry.Events.ExperimentCompletedEvent.TrackEvent[TMetrics](RunDetail1 bestRun, List1 allRuns, TaskKind machineLearningTask, TimeSpan duration)
at Microsoft.ML.CLI.CodeGenerator.CodeGenerationHelper.GenerateCode()
at Microsoft.ML.CLI.Program.<>cDisplayClass1_0.b__0(NewCommandSettings options)
Please see the log file for more info.
Exiting ...
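
The parse failure reported in the log ("Could not parse value 4 in line 800001, column 0") can be checked against the input file directly. Below is a minimal Python sketch, not part of the ML.NET CLI. It assumes the data is a comma-separated file and that column 0 was loaded as a type that only accepts 0/1/true/false; the log does not state the inferred type, so that is an assumption. The function name `find_unparseable` and the sample rows are hypothetical.

```python
import csv
import io


def find_unparseable(lines, column=0,
                     allowed=frozenset({"0", "1", "true", "false"}),
                     limit=5):
    """Return up to `limit` (row_number, value) pairs whose value in
    `column` is not in `allowed`.

    Row numbers are 1-based and count parsed CSV rows. They match file
    line numbers only when no field contains an embedded newline.
    The `allowed` set is an assumption about what the loader accepts.
    """
    hits = []
    for row_no, row in enumerate(csv.reader(lines), start=1):
        if column >= len(row):
            continue  # short row: nothing to check in this column
        if row[column].strip().lower() not in allowed:
            hits.append((row_no, row[column]))
            if len(hits) >= limit:
                break
    return hits


# Tiny in-memory stand-in for the real dataset (hypothetical rows).
sample = io.StringIO('0,"first"\n0,"second"\n4,"third"\n')
print(find_unparseable(sample))  # [(3, '4')]
```

To run it on a real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` points at the training data.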

harishsk commented 4 years ago

@JakeRadMSFT I can't reproduce this bug in the latest master build. I am transferring this issue over to you to resolve and close as you see fit.

LittleLittleCloud commented 4 years ago

@JoeMayo, based on the log, it seems you are using the old mlnet.cli to train your model, and that version is no longer maintained. The new mlnet.cli doesn't have this issue and will come out soon.

LittleLittleCloud commented 4 years ago

Closing this issue due to no response. If you're still seeing a problem, feel free to reopen it. Thanks for reporting!