Open PeterPann23 opened 5 years ago
This is a great suggestion, Peter. We do have most of what you are asking for in FastTree's options. FastTree is a very similar algorithm to LightGBM (both are boosted decision trees and share many ideas).
Can you give it a shot and see if it does what you need?
Rules are nice, but the delegate would give control to the calling code. Will it get implemented?
I have played with FastTree since release 0.8, but found the results of LightGBM to be better, although the 1.0 release seems to never finish and somehow takes a lot longer to execute than the preview.
That sounds like a really useful proposal!
For example, I am currently looking for a way to continuously plot the accuracy of the model against the training and test sets between iterations. This feature would allow that.
Take care, Martin
This is an important feature; however, there are several different ways one could decide to stop training early.
At the moment training runs for a fixed number of iterations, 100 if I'm not mistaken. Although that number is nice, it would be better to provide an early stopping rule. Providing a rule would, in my opinion, override/enhance the NumberOfIterations property.
Some simple generic rules would help
Proposed delegate
This would allow us to stop or adjust training early if the model is not improving as expected for the iterations or time spent.
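To make the idea concrete, here is a minimal, language-agnostic sketch in Python of a delegate the trainer would call after every iteration. It is not ML.NET or LightGBM API: `IterationInfo`, `make_patience_rule`, and `run_training` are hypothetical names invented for illustration, and the "trainer" just replays a list of precomputed validation metrics (higher is better).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class IterationInfo:
    """Hypothetical payload the trainer hands to the delegate after each iteration."""
    iteration: int
    elapsed_seconds: float
    validation_metric: float  # assumption: higher is better (e.g. accuracy)


# The delegate returns True to keep training and False to stop.
ContinueTraining = Callable[[IterationInfo], bool]


def make_patience_rule(patience: int,
                       min_delta: float = 0.0,
                       max_seconds: Optional[float] = None) -> ContinueTraining:
    """Stop after `patience` consecutive iterations without an improvement
    larger than `min_delta`, or once `max_seconds` has been exceeded."""
    best = float("-inf")
    stale = 0

    def should_continue(info: IterationInfo) -> bool:
        nonlocal best, stale
        if max_seconds is not None and info.elapsed_seconds > max_seconds:
            return False
        if info.validation_metric > best + min_delta:
            best = info.validation_metric
            stale = 0
        else:
            stale += 1
        return stale < patience

    return should_continue


def run_training(metrics: List[float],
                 seconds_per_iteration: float,
                 should_continue: ContinueTraining) -> int:
    """Stand-in for a trainer loop: replays precomputed metrics and
    returns how many iterations ran before the delegate said stop."""
    completed = 0
    for i, metric in enumerate(metrics, start=1):
        completed = i
        info = IterationInfo(i, i * seconds_per_iteration, metric)
        if not should_continue(info):
            break
    return completed


if __name__ == "__main__":
    history = [0.60, 0.70, 0.75, 0.75, 0.74, 0.75, 0.80]
    ran = run_training(history, 1.0, make_patience_rule(patience=3))
    print(f"stopped after {ran} of {len(history)} iterations")
```

Because the delegate is ordinary calling code, the same hook could also append each `IterationInfo` to a list for reporting or plotting instead of (or in addition to) deciding when to stop.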
Some use cases:
Speed up training and improve quality by making better use of trainer options
Reporting:
In Multi-class:
Generative adversarial network (GAN)
Adversarial machine learning (spam filters, vulnerability testing, etc.)
The above list is not a complete set of use cases, just some that come to mind and would greatly improve the usability of the framework.
I know of no way, at the moment, to do this with the current framework without a massive waste of resources and time. Our in-house framework has this, and some of my models train for days on a big server, so iterating / poking around (and waiting) with values is not really an option.
When playing with Iris-sized sample data this feature might sound silly, as training is done before one can sip a cup of coffee. Production development is a bit different.