dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

[ML.net cli] Resume a cli training after a crash #6286

Open wil70 opened 2 years ago

wil70 commented 2 years ago

Hello, is there a way to resume an ML.NET CLI training run from where it was before a crash? I have a lot of data in the folder C:\Users\wwww\AppData\Local\Temp\AutoML-NNI\Experiment-9K67B4, but I do not know how to make mlnet start from there.

Details: I used the CLI, i.e. "mlnet classification ....". I trained for a few days, but then I made a mistake that used up a lot of memory on my computer, which stopped the mlnet process. I would like to restart mlnet from where it left off so it can continue from there.

Thanks, w

2022-08-10 15:03:24.3091 DEBUG System.InvalidOperationException: Event we were waiting on was subject to an exception
 ---> System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at System.Array.Resize[T](T[]& array, Int32 newSize)
   at Microsoft.ML.Internal.Utilities.ArrayUtils.EnsureSize[T](T[]& array, Int32 min, Int32 max, Boolean keepOld, Boolean& resized)
   at Microsoft.ML.Data.CacheDataView.ColumnCache.ImplOne`1.CacheCurrent()
   at Microsoft.ML.Data.CacheDataView.Filler(DataViewRowCursor cursor, ColumnCache[] caches, OrderedWaiter waiter)

dakersnar commented 2 years ago

I don't believe this functionality is currently supported.

cc @luisquintanilla for confirmation. Is this something that we are tracking support for in the future?