dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

When training with AutoML, I encounter a Win32Exception: The wait operation timed out after 30 seconds. #7166

Closed. jackpotcityco closed this issue 4 days ago

jackpotcityco commented 3 weeks ago

CPU: i7-12800H (14 cores, 20 threads)
RAM: 32 GB
SSD: Samsung 980 Pro, 2 TB
OS: Windows Server 2019 Datacenter Evaluation
.NET Framework: 4.8
Microsoft.ML: 3.0.1
Microsoft.ML.AutoML: 0.21.1

Issue: When I start training a model with AutoML, I get a timeout after exactly 30 seconds and the application stops with the error message below. (At the time of the exception, CPU usage is at 15% and 50% of RAM is still available.)

System.AggregateException: 'One or more errors occurred.'
TargetInvocationException: Exception has been thrown by the target of an invocation.
SqlException: Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
Win32Exception: The wait operation timed out

Normally, training should continue, and I don't understand why I get this timeout after 30 seconds. It looks like a default value that should be possible to increase, but I have not found where to change it or how else to solve this problem.

The datasets look like this:
trainData: 384846 rows, 64 columns
hold_out_data: 33958 rows, 64 columns

Below is the code I use. (You might suggest using "MaxModels", but I believe the root problem is being able to change the timeout period.)


        void trainer_function(IDataView trainData, IDataView hold_out_data)
        {
            MLContext mlContext = new MLContext();
            var experiment = mlContext.Auto().CreateBinaryClassificationExperiment(new BinaryExperimentSettings
            {
                MaxExperimentTimeInSeconds = 600,
                CacheBeforeTrainer = CacheBeforeTrainer.On,
                CacheDirectoryName = "C:/Aintelligence/temp/cache",
                MaximumMemoryUsageInMegaByte = 8192,
                OptimizingMetric = BinaryClassificationMetric.PositivePrecision,
                CancellationToken = CancellationToken.None
            });
            var progressHandler = new Progress<RunDetail<BinaryClassificationMetrics>>(ph =>
            {
                if (ph.ValidationMetrics != null && !ph.TrainerName.Contains("FastForest"))
                {
                    double positivePrecision = Math.Round(ph.ValidationMetrics.PositivePrecision, 3); //Do something with: "positivePrecision"
                }
            });
            //Start the experiment/Training
            var results = experiment.Execute(trainData, hold_out_data, labelColumnName: "Label", progressHandler: progressHandler);
        }

LittleLittleCloud commented 3 weeks ago

The timeout seems to come from the SqlException. Are you querying the dataset from SQL Server?

jackpotcityco commented 3 weeks ago

Yes, you are right. I can see now that the timeout was the command timeout of the SQL query I used.

I set a higher command timeout (in seconds) so the query has time to finish, and now it works.

Thank you for your help!
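
For anyone who hits the same exception: the thread does not show how the data was loaded, so the following is only a minimal sketch of the kind of change described above, assuming the query runs through ADO.NET's SqlCommand from System.Data.SqlClient (built into .NET Framework 4.8). SqlCommand.CommandTimeout defaults to 30 seconds, which matches the observed failure time. The helper name CreateCommand, the connection string, the table name, and the 300-second value are all hypothetical placeholders.

```csharp
using System;
using System.Data.SqlClient;

static class QueryTimeoutSketch
{
    // Hypothetical helper: builds a command whose timeout is raised above
    // the 30-second SqlCommand default.
    public static SqlCommand CreateCommand(SqlConnection connection, string query, int timeoutSeconds)
    {
        var command = new SqlCommand(query, connection);
        command.CommandTimeout = timeoutSeconds; // in seconds; 0 means no limit
        return command;
    }

    static void Main()
    {
        // Placeholder connection string and query (assumptions, not from the issue).
        // Nothing is opened or executed here; the sketch only configures the command.
        using (var connection = new SqlConnection("Server=localhost;Database=TrainingDb;Integrated Security=true;"))
        using (var command = CreateCommand(connection, "SELECT * FROM TrainingData", 300))
        {
            Console.WriteLine(command.CommandTimeout);
        }
    }
}
```

The command timeout is separate from the connection timeout in the connection string, so raising the latter alone would not be expected to help with a long-running query. It is also independent of MaxExperimentTimeInSeconds in the AutoML settings, which only bounds the experiment itself.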