Open justinormont opened 3 years ago
Thanks @justinormont!! I can work on adding a new file for this as part of the repo cleanup.
@luisquintanilla Is there anywhere in Docs we could add this to as well?
@briacht @luisquintanilla Hope y'all don't mind me chiming in but would this doc be what you need? https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/tasks
@jwood803: Looks like an appropriate place. This may also work: https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-choose-an-ml-net-algorithm
Currently, there is no single page that summarizes what each trainer is good at, along with its features and limitations.
Here's a start which I made with @glebuk a while back:
Empty cells in the table have simply not been filled in yet: an empty cell is a missing answer.
In Excel form: ML.NET Trainer Cheatsheet.xlsx
Table in Markdown form
Notes:

- [1] k-means needs normalization, but you can also upweight certain features by giving them a larger scale.
- [2] OVA and PKPD are streamable if the underlying trainers are.
- [3] Most binary trainers handle multi-class classification tasks when used with OVA or PKPD.
- [4] Naive Bayes stops reading at 2B rows (is this fixed?).
- [5] Naive Bayes does better on numeric data with MeanVar with fixZero=false, or with whitening (both move the mean to zero, though this will densify ngram/categorical features, which is very slow).
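To make notes [1] and [5] a bit more concrete, here is a rough, dependency-free Python sketch (not ML.NET code). It only illustrates the two scaling effects mentioned: giving a feature a larger scale makes it count for more in a Euclidean distance like the one k-means uses, and moving the mean to zero turns a mostly-zero column into a dense one. All names (`euclidean`, `scale_feature`, `mean_center`, `nonzero_count`) and the toy data are made up for illustration; they are not part of any ML.NET API.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def scale_feature(row, index, factor):
    """Return a copy of row with one feature multiplied by factor (hypothetical helper)."""
    out = list(row)
    out[index] *= factor
    return out


def mean_center(column):
    """Subtract the column mean from every value, so the new mean is zero."""
    mean = sum(column) / len(column)
    return [v - mean for v in column]


def nonzero_count(column):
    """Number of entries a sparse representation would have to store."""
    return sum(1 for v in column if v != 0)


def nearest(point, centroids):
    """Name of the centroid closest to point."""
    return min(centroids, key=lambda name: euclidean(point, centroids[name]))


# Note [1]: upweighting a feature by giving it a larger scale.
# Toy data: the point differs from centroid "a" only in feature 0,
# and from centroid "b" only in feature 1.
point = [0.0, 0.0]
centroids = {"a": [1.0, 0.0], "b": [0.0, 2.0]}
nearest_before = nearest(point, centroids)  # distances: a=1, b=2

# Multiply feature 0 by 3 everywhere, so differences in it count more.
factor = 3.0
scaled_point = scale_feature(point, 0, factor)
scaled_centroids = {k: scale_feature(v, 0, factor) for k, v in centroids.items()}
nearest_after = nearest(scaled_point, scaled_centroids)  # distances: a=3, b=2

# Note [5]: moving the mean to zero densifies sparse data.
sparse_column = [0.0, 0.0, 0.0, 4.0]    # one stored value out of four
centered_column = mean_center(sparse_column)  # [-1.0, -1.0, -1.0, 3.0]
stored_before = nonzero_count(sparse_column)
stored_after = nonzero_count(centered_column)
```

In this toy setup the nearest centroid flips from `a` to `b` once feature 0 is scaled up, and the centered column goes from one nonzero entry to four, which is the densification cost note [5] warns about for ngram/categorical features.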