dotnet / machinelearning-modelbuilder

Simple UI tool to build custom machine learning models.
Creative Commons Attribution 4.0 International

Training time finished without any models trained. #ML.NET #638

Open dipeshtare opened 4 years ago

dipeshtare commented 4 years ago

Problem encountered on https://dotnet.microsoft.com/learn/ml-dotnet/get-started-tutorial/train Operating System: windows

Provide details about the problem you're experiencing. Include your operating system version, exact error message, code sample, and anything else that is relevant.

at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.AutoMLExperiment`3.d23.MoveNext() in E:\A_work\1326\s\Microsoft.ML.ModelBuilder.AutoMLService\Experiments\AutoMLExperiment.cs:line 103
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.AutoMLEngine.d28.MoveNext() in E:\A_work\1326\s\Microsoft.ML.ModelBuilder.AutoMLService\AutoMLEngineService\AutoMLEngine.cs:line 134

ML.NET

JakeRadMSFT commented 4 years ago

Hello, Sorry you're hitting this error.

pokusnik commented 4 years ago

Hello, Sorry you're hitting this error.

How big is your dataset? How long did you train for? What scenario type did you choose?

Hi, unfortunately I have the same problem: Visual Studio 16.5.2, ML.NET Model Builder 16.0.2003.302

input,result
1,10
2,20
3,30
4,40
5,50
6,60

slobo80 commented 4 years ago

I have the same problem.

Q: How big is your dataset?
A: 30 rows, 8 features.

Q: How long did you train for?
A: 10 seconds.

Q: What scenario type did you choose?
A: Price prediction.

Visual Studio information

Microsoft Visual Studio Enterprise 2019 Int Preview Version 16.6.0 Preview 3.0 [30005.37.master] VisualStudio.16.IntPreview/16.6.0-pre.3.0+30005.37.master Microsoft .NET Framework Version 4.8.03752

Installed Version: Enterprise

Visual C++ 2019 00433-90050-52975-AA754 Microsoft Visual C++ 2019

ASP.NET and Web Tools 2019 16.6.822.59114 ASP.NET and Web Tools 2019

ASP.NET Core Razor Language Services 16.1.0.2018108+d23b03d665e5a8ca639ce3f3e22bc9e8c921d686 Provides languages services for ASP.NET Core Razor.

ASP.NET Web Frameworks and Tools 2012 16.6.822.59114 For additional information, visit https://www.asp.net/

ASP.NET Web Frameworks and Tools 2019 16.6.822.59114 For additional information, visit https://www.asp.net/

Azure App Service Tools v3.0.0 16.6.822.59114 Azure App Service Tools v3.0.0

Azure Functions and Web Jobs Tools 16.6.822.59114 Azure Functions and Web Jobs Tools

C# Tools 3.6.0-3.20201.9+8ee2960308721bf49b4f496da46d80cbbc1afb80 C# components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.

Chutzpah Context Menu 4.4.8 Enables you to run JavaScript unit tests via Chutzpah from the context menu

Common Azure Tools 1.10 Provides common services for use by Azure Mobile Services and Microsoft Azure Tools.

IntelliCode Extension 1.0 IntelliCode Visual Studio Extension Detailed Info

Microsoft Azure Tools 2.9 Microsoft Azure Tools for Microsoft Visual Studio 2019 - v2.9.30212.1

Microsoft Continuous Delivery Tools for Visual Studio 0.4 Simplifying the configuration of Azure DevOps pipelines from within the Visual Studio IDE.

Microsoft JVM Debugger 1.0 Provides support for connecting the Visual Studio debugger to JDWP compatible Java Virtual Machines

Microsoft Library Manager 2.1.50+g25aae5a24a.R Install client-side libraries easily to any web project

Microsoft MI-Based Debugger 1.0 Provides support for connecting Visual Studio to MI compatible debuggers

Microsoft Visual C++ Wizards 1.0 Microsoft Visual C++ Wizards

Microsoft Visual Studio Tools for Containers 1.1 Develop, run, validate your ASP.NET Core applications in the target environment. F5 your application directly into a container with debugging, or CTRL + F5 to edit & refresh your app without having to rebuild the container.

Microsoft Visual Studio VC Package 1.0 Microsoft Visual Studio VC Package

Node.js Tools 1.5.20317.1 Commit Hash:3e70368beb9630c811076c051f4c9a59b45d7c10 Adds support for developing and debugging Node.js apps in Visual Studio

NuGet Package Manager 5.6.0 NuGet Package Manager in Visual Studio. For more information about NuGet, visit https://docs.nuget.org/

ProjectServicesPackage Extension 1.0 ProjectServicesPackage Visual Studio Extension Detailed Info

Snapshot Debugging Extension 1.0 Snapshot Debugging Visual Studio Extension Detailed Info

SQL Server Data Tools 16.0.62003.20160 Microsoft SQL Server Data Tools

TypeScript Tools 16.0.20325.2001 TypeScript Tools for Microsoft Visual Studio

Visual Basic Tools 3.6.0-3.20201.9+8ee2960308721bf49b4f496da46d80cbbc1afb80 Visual Basic components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.

Visual F# Tools 10.9.1.0 for F# 4.7 16.6.0-beta.20175.3+0ecd7e82fac473898b1749b774630d2e09269f5b Microsoft Visual F# Tools 10.9.1.0 for F# 4.7

Visual Studio Code Debug Adapter Host Package 1.0 Interop layer for hosting Visual Studio Code debug adapters in Visual Studio

Visual Studio Container Tools Extensions (Preview) 1.0 View, manage, and diagnose containers within Visual Studio.

Visual Studio Tools for Containers 1.0 Visual Studio Tools for Containers

System Info
OS Name: Microsoft Windows 10 Enterprise
Version: 10.0.18363 Build 18363
Other OS Description: Not Available
OS Manufacturer: Microsoft Corporation
System Name: SLOBOSURFACE
System Manufacturer: Microsoft Corporation
System Model: Surface Laptop 2
System Type: x64-based PC
System SKU: Surface_Laptop_2_1769_Consumer
Processor: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz, 2112 Mhz, 4 Core(s), 8 Logical Processor(s)
BIOS Version/Date: Microsoft Corporation 137.2706.768, 4/18/2019
SMBIOS Version: 3.1
Embedded Controller Version: 255.255
BIOS Mode: UEFI
BaseBoard Manufacturer: Microsoft Corporation
BaseBoard Product: Surface Laptop 2
BaseBoard Version: Not Available
Platform Role: Mobile
Secure Boot State: On
PCR7 Configuration: Elevation Required to View
Windows Directory: C:\WINDOWS
System Directory: C:\WINDOWS\system32
Boot Device: \Device\HarddiskVolume2
Locale: United States
Hardware Abstraction Layer: Version = "10.0.18362.628"
User Name:
Time Zone: Pacific Daylight Time
Installed Physical Memory (RAM): 16.0 GB
Total Physical Memory: 15.9 GB
Available Physical Memory: 4.96 GB
Total Virtual Memory: 20.4 GB
Available Virtual Memory: 5.20 GB
Page File Space: 4.50 GB
Page File: C:\pagefile.sys
Kernel DMA Protection: Not Available
Virtualization-based security: Running
Virtualization-based security Required Security Properties: Base Virtualization Support, Secure Boot
Virtualization-based security Available Security Properties: Base Virtualization Support, Secure Boot, DMA Protection, UEFI Code Readonly, SMM Security Mitigations 1.0, Mode Based Execution Control
Virtualization-based security Services Configured: Credential Guard
Virtualization-based security Services Running: Credential Guard
Device Encryption Support: Elevation Required to View
A hypervisor has been detected. Features required for Hyper-V will not be displayed.

beccamc commented 4 years ago

Hi @pokusnik and @slobo80. Thanks for taking the time to report. The problem here is likely that the datasets you're using are too small. 30 rows of data is considered a small sample for ML.NET and is difficult to train against (ML.NET will split the set into 80% training and 20% testing).
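To make the 80/20 split concrete, here is a rough sketch in Python (not ML.NET code) of how few rows end up in the test portion for the datasets mentioned in this thread. The function name `split_sizes` and the integer rounding are assumptions made for illustration; the split ML.NET actually performs may not produce exactly these counts.

```python
def split_sizes(n_rows: int, test_percent: int = 20) -> tuple[int, int]:
    """Return (train_rows, test_rows) for a simple holdout split.

    Integer arithmetic is used so the result does not depend on
    floating-point rounding. This is an illustrative assumption, not
    the rule ML.NET applies internally.
    """
    test_rows = n_rows * test_percent // 100
    return n_rows - test_rows, test_rows


# Dataset sizes mentioned in the thread (6 and 30 rows), plus a larger one.
for n in (6, 30, 1000):
    train, test = split_sizes(n)
    print(f"{n} rows: {train} for training, {test} for testing")
```

With 6 rows only a single row is left for testing, and with 30 rows only 6, which is very little to evaluate a model on.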

Adding more data should allow the model to finish training. Can you share what you're trying to accomplish with your data (or share the dataset)? If you just want a sample for testing we have price prediction (regression) data available in this tutorial.

slobo80 commented 4 years ago

What would be a large enough dataset?

beccamc commented 4 years ago

@justinormont Can you provide some guidance on minimum file sizes? Thanks!

justinormont commented 4 years ago

Limitation from AutoML code

From the AutoML side, the smallest size which will run is pretty small. I'm not sure the minimum dataset size has been explicitly tested previously.

Quick test gave an answer of:

Just because it runs, doesn't mean it's a useful model. For that, you would like additional rows. This mostly depends on the difficulty of the problem you're trying to solve.

Using the AutoML․NET CLI to test: (datasets SyntheticTinyDatasets.zip)

Note the switched label column name for classification and regression.

AutoML bug for binary classification

Error one:

Exception occured while exploring pipelines:
Training failed with the exception: System.ArgumentOutOfRangeException: AUC is not defined when there is no positive class in the data
 Parameter name: PosSample
   at Microsoft.ML.Data.EvaluatorBase`1.AucAggregatorBase`1.ComputeWeightedAuc(Double& unweighted)
   at Microsoft.ML.Data.BinaryClassifierEvaluator.Aggregator.Finish()
   at Microsoft.ML.Data.BinaryClassifierEvaluator.<>c__DisplayClass32_0.<GetAggregatorConsolidationFuncs>b__0(UInt32 stratColKey, ReadOnlyMemory`1 stratColVal, Aggregator agg)
   at Microsoft.ML.Data.EvaluatorBase`1.ProcessData(IDataView data, RoleMappedSchema schema, Func`2 activeColsIndices, TAgg aggregator, AggregatorDictionaryBase[] dictionaries)
   at Microsoft.ML.Data.EvaluatorBase`1.Microsoft.ML.Data.IEvaluator.Evaluate(RoleMappedData data)
   at Microsoft.ML.Data.BinaryClassifierEvaluator.Evaluate(IDataView data, String label, String score, String predictedLabel)
   at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String labelColumn, IMetricsAgent`1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, AutoMLLogger logger) 
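For context on why a tiny dataset can trigger this first error: AUC compares the scores of positive rows against the scores of negative rows, so it has nothing to compute if the evaluation split happens to contain no positive rows. The following is a minimal Python sketch of that definition, not the ML.NET implementation; the function name `auc` and the guard for a missing negative class are assumptions added for illustration.

```python
def auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Raises ValueError when either
    class is missing, because there are then no pairs to compare.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined without both a positive and a negative row")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# A small evaluation split that happens to contain only negative rows.
try:
    auc([0, 0], [0.2, 0.4])
except ValueError as exc:
    print(f"evaluation failed: {exc}")
```

The smaller the dataset, the more likely it is that a random evaluation split ends up with a single class.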

Error two:

Exception occured while exploring pipelines:
Training failed with the exception: System.ArgumentNullException: Value cannot be null.
Parameter name: items
   at System.Collections.Immutable.Requires.FailArgumentNullException(String parameterName)
   at System.Collections.Immutable.ImmutableArray.Create[T](T[] items, Int32 start, Int32 length)
   at Microsoft.ML.Trainers.FastTree.RegressionTreeBase..ctor(InternalRegressionTree tree)
   at Microsoft.ML.Trainers.FastTree.TreeEnsembleModelParametersBasedOnRegressionTree.<>c.<CreateTreeEnsembleFromInternalDataStructure>b__5_0(InternalRegressionTree tree)
   at System.Linq.Enumerable.SelectListIterator`2.ToList()
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at Microsoft.ML.Trainers.FastTree.TreeEnsemble`1..ctor(IEnumerable`1 trees, IEnumerable`1 treeWeights, Double bias)
   at Microsoft.ML.Trainers.FastTree.TreeEnsembleModelParametersBasedOnRegressionTree.CreateTreeEnsembleFromInternalDataStructure()
   at Microsoft.ML.Trainers.LightGbm.LightGbmBinaryTrainer.CreatePredictor()
   at Microsoft.ML.Trainers.LightGbm.LightGbmTrainerBase`4.TrainModelCore(TrainContext context)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String labelColumn, IMetricsAgent`1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, AutoMLLogger logger)

Both of these errors from the ML․NET model training/evaluation should be caught by AutoML, and the sweeping process should continue. Specifically, the bug in AutoML is that these are not being caught. Fixing this bug in AutoML will reduce the number of needed rows.

Quick investigation didn't reveal the cause. I was thinking that we had a try-catch around the model training but not evaluate; wasn't the case. I'm not sure why the existing try-catch fails to catch the error.
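The intended behaviour described above (a failing candidate is recorded and the sweep moves on to the next one) can be sketched in Python as follows. This is not the AutoML code; `sweep`, the candidate names, and the scoring callables are all hypothetical and only illustrate the control flow.

```python
from typing import Callable, Iterable, Optional


def sweep(
    candidates: Iterable[tuple[str, Callable[[], float]]],
) -> tuple[Optional[str], list[str]]:
    """Train and score every candidate; a failure must not end the sweep.

    Returns the name of the best-scoring candidate (None if all failed)
    and a list of failure messages.
    """
    best_name: Optional[str] = None
    best_score = float("-inf")
    failures: list[str] = []
    for name, train_and_score in candidates:
        try:
            score = train_and_score()
        except Exception as exc:  # record the failure and keep sweeping
            failures.append(f"{name}: {exc}")
            continue
        # A NaN score never compares greater, so it is skipped as well.
        if score > best_score:
            best_name, best_score = name, score
    return best_name, failures


# Hypothetical candidates: the first one fails, the others return a score.
best, failed = sweep([
    ("failing-trainer", lambda: 1 / 0),
    ("trainer-a", lambda: 0.7),
    ("trainer-b", lambda: 0.9),
])
print(best, failed)
```

With this structure, "no models trained" is reported only when every candidate fails or the time budget runs out before any candidate finishes.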

Limitation from Model Builder

The user-reported errors above imply that Model Builder requires more rows than the AutoML code does.

The error reported above points to MoveNext() in AutoMLExperiment.cs:line 103:

at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.AutoMLExperiment`3.d__23.MoveNext() in E:\A_work\1326\s\Microsoft.ML.ModelBuilder.AutoMLService\Experiments\AutoMLExperiment.cs:line 103

I'd recommend investigating the Model Builder bug as well.

beccamc commented 4 years ago

@justinormont Thank you for the detailed analysis! I'll check out the Model Builder bug.

@slobo80 For the number of rows, per Justin's advice above "Just because it runs, doesn't mean it's a useful model. For that, you would like additional rows. This mostly depends on the difficulty of the problem you're trying to solve."

However, you want a model with the rows you have. Also based on Justin's advice above I've done some more testing and for your dataset you probably need to up the runtime significantly. Justin used 60 seconds. Model Builder has some additional wrappers not included in the CLI. I used a dummy dataset with 20 lines and it took ~120 seconds in Model Builder. You started with 10 seconds, can you try running for significantly longer?

beccamc commented 4 years ago

I've updated the error message to be more specific for small datasets. We're also adding more information to our docs.

The stack trace above looks like the rpc exception trace. It's most likely the timeout exception thrown from the service.

@slobo80 Did you get your dataset working?

slobo80 commented 4 years ago

@beccamc, since I don't have a larger dataset I could not get it to work.

beccamc commented 4 years ago

@Slobo80 how long did you train? Did you try 120 seconds?

Also, I've found that some datasets cannot produce a model because they are actually forecasting problems (predicting future values based on historical values) rather than regression problems (predicting a numeric value based on a set of related features). We don't support forecasting in Model Builder yet.

justinormont commented 4 years ago

@slobo80 : With 30 rows & 8 features, your dataset is quite small. What's the use case?

--

Also, I've found that some datasets cannot produce a model because they are actually forecasting problems (predicting future values based on historical values) rather than regression problems (predicting a numeric value based on a set of related features). We don't support forecasting in Model Builder yet.

Predicting the future is pretty common in regression (and ML in general), though your features should be indicative of that future value.

For instance, your label could be tomorrow's number of Uber rides and your features could be { average number of rides in the last week/month/year, weather prediction for tomorrow, is holiday, day of week, month of the year }.
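As a rough illustration of that framing, the sketch below turns a daily series of ride counts into regression rows, with a rolling average and calendar features as inputs and the next day's count as the label. It is written in Python for brevity; the function name `build_rows`, the 7-day window, and the feature names are assumptions, and the weather and holiday features are left out.

```python
from datetime import date, timedelta


def build_rows(start: date, rides: list[int], window: int = 7) -> list[dict]:
    """Turn a daily series into (features, label) rows for regression.

    rides[i] is the count for the day `start + i days`. Each row uses
    only the `window` days strictly before the day being predicted.
    """
    rows = []
    for i in range(window, len(rides)):
        target_day = start + timedelta(days=i)
        history = rides[i - window:i]  # past days only, never the target day
        rows.append({
            "avg_last_week": sum(history) / window,
            "day_of_week": target_day.weekday(),  # 0 = Monday
            "month": target_day.month,
            "label": rides[i],
        })
    return rows


# Ten days of made-up counts starting on Monday 2020-03-02.
rows = build_rows(date(2020, 3, 2), [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
print(rows[0])
```

Each row's features are computed only from days before the label's day, which is what keeps them usable at prediction time.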

You would want to split your dataset on time to avoid leakage (oldest in train, newer in validate, newest in test). Forecasting would automate this for you (rolling origin cross-validation, featurization including calculating rolling averages) and ensure it's done properly. Forecasting also has additional specialization like working directly on the univariate time-series data (which @beccamc may be alluding to), handling of multiple grains (e.g. predicting each city's Uber rides), and inbuilt ability to predict at multiple time periods in the future.
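A minimal Python sketch of the two splitting schemes mentioned above, assuming the rows are already sorted oldest first. The function names `time_split` and `rolling_origin` and their parameters are hypothetical; this is not how any ML.NET forecasting component is implemented.

```python
from typing import Iterator


def time_split(rows: list, n_validate: int, n_test: int) -> tuple[list, list, list]:
    """Split oldest-first rows into (train, validate, test) by position."""
    n_train = len(rows) - n_validate - n_test
    if n_train <= 0:
        raise ValueError("not enough rows left for training")
    return (
        rows[:n_train],
        rows[n_train:n_train + n_validate],
        rows[n_train + n_validate:],
    )


def rolling_origin(n: int, min_train: int, horizon: int = 1) -> Iterator[tuple[list[int], list[int]]]:
    """Yield (train_indices, test_indices) folds for rolling-origin CV.

    Each fold trains on everything before the origin and tests on the
    next `horizon` rows; the origin then moves forward by one row.
    """
    for origin in range(min_train, n - horizon + 1):
        yield list(range(origin)), list(range(origin, origin + horizon))


train, validate, test = time_split(list(range(10)), n_validate=2, n_test=2)
print(train, validate, test)
for fold in rolling_origin(5, min_train=3):
    print(fold)
```

In both schemes every test row is newer than every row used to train the model that is evaluated on it.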

slobo80 commented 4 years ago

@beccamc I tried 120s and got the same error. @justinormont The use case is stock price data with 8 features since the lockdown started ~March 10.

justinormont commented 4 years ago

@slobo80 : I wouldn't expect that the AutoML will do well on that problem due to size and the general unpredictability of the stock market.

Want to post your data? Then I can try running it in the CLI.

While it will run correctly in the AutoML․NET CLI, you won't be able to run the dataset correctly in Model Builder since it doesn't allow you to provide a separate validation and test dataset (and will not automatically split based on time). There is not a way in Model Builder to not leak the time-series information.

As quoted from the Wikipedia article on leakage: (disclosure: self-quoting, so not a secondary source)

Row-wise leakage is caused by improper sharing of information between rows of data. ... Time leakage (e.g. splitting a time-series dataset randomly instead of newer data in test set using a TrainTest split or rolling-origin cross validation)

Though Model Builder cannot run the dataset correctly, it should still run to completion without errors. If you post your dataset perhaps @beccamc et al. can investigate with Model Builder.
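One simple way to check a split for the time leakage described in the quote is to compare the newest training timestamp with the oldest test timestamp. The Python sketch below does only that; the function name `has_time_leakage` and the use of integer day numbers as timestamps are assumptions for illustration.

```python
def has_time_leakage(train_times: list[int], test_times: list[int]) -> bool:
    """True if any training row is as new as, or newer than, the oldest test row."""
    return max(train_times) >= min(test_times)


days = list(range(10))  # day numbers, oldest first

# Test rows picked from the middle of the series (as a random split might do).
scattered_test = [4, 9]
scattered_train = [d for d in days if d not in scattered_test]

# Test rows taken from the end of the series.
time_train, time_test = days[:8], days[8:]

print(has_time_leakage(scattered_train, scattered_test))
print(has_time_leakage(time_train, time_test))
```

The scattered split trains on days 5 to 8 while testing on day 4, so the model sees data from after the period it is evaluated on; the time-ordered split does not.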

beccamc commented 4 years ago

@slobo80 Do you want to share your data so I can see why model builder is failing?

beccamc commented 4 years ago

Closing this due to inactivity. Please ping me or reopen if this is still an issue. Thanks!

Datakda commented 3 years ago

I have the same problem. I cannot train the model. I am using the data at https://github.com/dotnet/machinelearning-samples/blob/master/datasets/taxi-fare-train.csv and I get the error

beccamc commented 3 years ago

@Datakda You might need to increase the training time. Try giving a few minutes and see if you can complete the training.

80LevelElf commented 10 months ago

Hi everyone! We have the same troubles with "Training time finished without completing a successful trial. Either no trial completed or the metric for all completed trials are NaN or Infinity"

Are there any updates on this ticket? It looks very critical.

Is there maybe a workaround?
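On the "NaN or Infinity" part of that message: one way a regression metric can come out as NaN is when the evaluation rows all share the same label, which always happens when only a single row is evaluated. The Python sketch below shows this for R-squared. It is only an illustration of the arithmetic; whether this is what happens inside Model Builder for any dataset in this thread is an assumption, not something the thread confirms.

```python
import math


def r_squared(actual: list[float], predicted: list[float]) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Returns NaN when the actual values have zero variance, because the
    denominator SS_tot is then zero and the ratio is undefined.
    """
    mean = sum(actual) / len(actual)
    ss_tot = sum((y - mean) ** 2 for y in actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    if ss_tot == 0.0:
        return math.nan
    return 1.0 - ss_res / ss_tot


# A single evaluation row: the label's variance is zero.
print(r_squared([60.0], [58.0]))

# Several rows with different labels and perfect predictions.
print(r_squared([10.0, 20.0, 30.0], [10.0, 20.0, 30.0]))
```

If every completed trial is scored on such a split, none of them has a usable metric, even though training itself succeeded.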

veronikaria commented 5 months ago

I'm trying to train the model from learning: sentiment labelled sentences.zip

Keep getting the same error. Are there any updates on this issue?

Model Builder Error Training time finished without any models trained.

at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.LocalAutoMLExperiment.d15.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/Experiments/LocalAutoMLExperiment.cs:line 170
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.AutoMLEngine.d_21.MoveNext() in //src/Microsoft.ML.ModelBuilder.AutoMLService/AutoMLEngineService/AutoMLEngine.cs:line 167
at StreamJsonRpc.JsonRpc.d143`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.ViewModels.TrainViewModel.<b__109_0>d.MoveNext()