Nice feature, however better to have an Event with an EventArgs to manage this in production code

PeterPann23 commented 5 years ago

This is an important feature; however, there are several different ways one could conclude that training should stop early.

At the moment the training runs for a fixed number of iterations (100, if I'm not mistaken). Although that number is nice, it would be better to provide an early stopping rule. Providing a rule would, in my opinion, override/enhance the NumberOfIterations property.

Some simple generic rules would help:

Minimum improvement
MaxDuration (TimeSpan)
Number of Iterations
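
For illustration only, here is a minimal Python sketch of how the three rules above could be combined into one check. Every name in it (`EarlyStoppingRules`, `should_stop`, the field names) is hypothetical and not part of ML.NET, and it assumes a higher score is better.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class EarlyStoppingRules:
    """Hypothetical rule set; these names are not part of ML.NET."""
    min_improvement: Optional[float] = None  # smallest score gain that counts as progress
    max_duration_s: Optional[float] = None   # wall-clock budget in seconds (the TimeSpan rule)
    max_iterations: Optional[int] = None     # hard cap on training iterations

    def should_stop(self, iteration, prev_score, score, started_at, now=None):
        """Return the name of the first rule that fires, or None to keep training."""
        now = time.monotonic() if now is None else now
        if self.max_iterations is not None and iteration >= self.max_iterations:
            return "max_iterations"
        if self.max_duration_s is not None and now - started_at >= self.max_duration_s:
            return "max_duration"
        # Assumption: higher score is better, so improvement = score - prev_score.
        if (self.min_improvement is not None and prev_score is not None
                and score - prev_score < self.min_improvement):
            return "min_improvement"
        return None
```

A rule left as `None` is simply skipped, so the same object can express "iterations only" or any combination of the three.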

Proposed delegate

Iteration Nr => (get)
Maximum Iteration => (get)
Prev Score / Improvement => (get)
Current Score / Improvement => (get)
- Auto-configured [Trainer].Options specific values related to fitting
StopNow => (get/set)
GetMetrics(validationdata)
GetCurrentModel()

This would allow us to stop the training early if the score is not improving as expected for the iterations or time spent.
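
To make the proposal concrete, here is a rough Python sketch of a per-iteration callback with a settable stop flag. It is not ML.NET code: `IterationContext`, `train`, and `stop_when_flat` are hypothetical names, the "training loop" just replays a list of fake validation scores, and the GetMetrics/GetCurrentModel members are left out to keep it short.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class IterationContext:
    """Hypothetical per-iteration state, mirroring the members proposed above."""
    iteration: int
    max_iterations: int
    previous_score: Optional[float]
    current_score: float
    stop_now: bool = False  # the callback sets this to request an early stop


def train(scores: List[float], max_iterations: int,
          on_iteration: Callable[[IterationContext], None]) -> int:
    """Stand-in training loop: 'scores' fakes the validation score after each iteration.

    Returns the number of iterations actually run.
    """
    previous = None
    completed = 0
    for i, score in enumerate(scores[:max_iterations], start=1):
        ctx = IterationContext(i, max_iterations, previous, score)
        on_iteration(ctx)  # calling code inspects progress and may set stop_now
        completed = i
        if ctx.stop_now:
            break
        previous = score
    return completed


def stop_when_flat(ctx: IterationContext) -> None:
    """Example callback: stop once the score gains less than 0.01 over the previous one."""
    if ctx.previous_score is not None and ctx.current_score - ctx.previous_score < 0.01:
        ctx.stop_now = True
```

The point of the design is that the stopping decision lives in the calling code, so a fixed rule is just one possible callback among many.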

Some use cases:

Speed up training and improve quality by making better use of trainer options
- understand the auto-discovered properties and how the trainer adjusts them, so that one can better understand what to specify in the [Trainer].Options, such as LearningRate, NumberOfLeaves, etc.
Reporting:
- allows generating charts showing the progress of training (in real time and as a log).
- allows joining machine resource usage with training progress in reporting/logging.
In Multi-class:
- Additionally train a specific class (under/overfitting).
- Store "sub models" for specific classes.
Generative adversarial network (GAN)
- Hook for joining [N] networks together, allowing them to improve each other in a more efficient way.
Adversarial machine learning (spam filters, vulnerability testing, etc.)
- hook for additional training on specific exploits and/or making/saving specific models for specific sets of exploits/classes.

The above list is not a complete set of use cases, just some that come to mind and would greatly improve the usability of the framework.

I know of no way, at the moment, to do this with the current framework without a massive waste of resources and time. Our in-house framework has this, and some of my models train for days on a big server, so iterating / poking around (and waiting) with values is not really an option.

When playing with data the size of the Iris sample this feature might sound silly, as training is done before one can sip a cup of coffee; production development is a bit different.

Document Details

⚠ Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

ID: 1ee559b0-04b6-5280-ed68-6291d1e2f7cf
Version Independent ID: abe34fa0-63f1-7501-165b-20b55190dc0b
Content: LightGbmTrainerBase<TOptions,TOutput,TTransformer,TModel>.OptionsBase.EarlyStoppingRound Field (Microsoft.ML.Trainers.LightGbm)
Content Source: dotnet/xml/Microsoft.ML.Trainers.LightGbm/LightGbmTrainerBase`4+OptionsBase.xml
Product: dotnet-ml-api
GitHub Login: @sfilipi
Microsoft Alias: johalex

glebuk commented 5 years ago

This is a great suggestion, Peter. We do have most of what you are asking for in FastTree's options. FastTree is a very similar algorithm to LightGBM (both are boosted decision trees and have a lot of ideas in common).

Can you give it a shot and see if it does what you need?

PeterPann23 commented 5 years ago

Rules are nice; the delegate, however, would give control to the calling code. Will it get implemented?

I played with FastTree starting with release 0.8, but found the results of LightGBM to be better, whereas the 1.0 release seems to never finish; it somehow takes a lot longer to execute than the preview.

8 commented 3 years ago

That sounds like a really useful proposal!

For example, I am currently looking for a way to continuously plot the accuracy of the model against the training and test set between iterations. This feature would allow that.
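
As a rough idea of what the plotting side could look like, here is a small Python sketch that collects one row per iteration and renders it as CSV for any charting tool. `MetricHistory` is a hypothetical helper, not an ML.NET type, and it assumes a per-iteration hook exists that can hand over both accuracies.

```python
import csv
import io


class MetricHistory:
    """Collects one row per iteration so the curves can be plotted or logged later."""

    def __init__(self):
        self.rows = []

    def record(self, iteration, train_accuracy, test_accuracy):
        """Intended to be called from the (hypothetical) per-iteration hook."""
        self.rows.append((iteration, train_accuracy, test_accuracy))

    def to_csv(self):
        """Render the history as CSV text, ready for a plotting tool."""
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(["iteration", "train_accuracy", "test_accuracy"])
        writer.writerows(self.rows)
        return buffer.getvalue()
```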

Take care, Martin

dotnet / machinelearning

Nice feature, however better to have an Event with an EventArgs to manage this in production code #3685
