Open jpepg4 opened 3 years ago
The missing trainers should be available for a user to enable, but most of them should be disabled by default.
The default search space for AutoML is meant to be efficient. GAM, for instance, is very slow and will rarely win compared to FastTree or LightGBM. GAM is designed for explainability, so a user may want to enable it manually, perhaps along with linear models, for use cases where explainability is key.
LDSVM is an exception: it was added to ML.NET after AutoML's search space was created. LDSVM would be beneficial to enable by default, as it can do well on text/ngrams.
List of current binary trainers used in AutoML: https://github.com/dotnet/machinelearning/blob/5dbfd8acac0bf798957eea122f1413209cdf07dc/src/Microsoft.ML.AutoML/API/BinaryClassificationExperiment.cs#L88-L137
Notes: OVA-LDSVM should also be added to multiclass AutoML. When adding a new trainer, make sure it works in CodeGen.
@justinormont so the action item from this is to add OVA-LDSVM to AutoML and test it, correct? Are there any others that should be added as well?
Should add, and have on by default:
Should add, and have off by default:
Adding the rest, and having them default off, keeps our search space efficient while allowing the user to manually enable those trainers, for instance if they want to optimize GAM (which is not currently an available choice).
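For context, here is a rough sketch of how a user can already narrow the binary search space to a chosen set of trainers through `BinaryExperimentSettings.Trainers`. It assumes the `Microsoft.ML.AutoML` package is referenced and, as far as I know, that `Trainers` starts out populated with the full default set. GAM and LDSVM do not appear here because, per this issue, they are not yet members of the `BinaryClassificationTrainer` enum; opting in to a default-off trainer would presumably look like one more `Add` call, but that is an assumption about a future API.

```csharp
using Microsoft.ML;
using Microsoft.ML.AutoML;

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        var settings = new BinaryExperimentSettings
        {
            // Time budget for the whole sweep; 60 is an arbitrary example value.
            MaxExperimentTimeInSeconds = 60
        };

        // Drop the default trainer set, then add back only the trainers to sweep.
        settings.Trainers.Clear();
        settings.Trainers.Add(BinaryClassificationTrainer.LightGbm);
        settings.Trainers.Add(BinaryClassificationTrainer.FastTree);
        settings.Trainers.Add(BinaryClassificationTrainer.LbfgsLogisticRegression);

        // The experiment only considers the trainers listed above.
        var experiment = mlContext.Auto().CreateBinaryClassificationExperiment(settings);

        // Running it requires an IDataView with a label column, e.g.:
        // var result = experiment.Execute(trainData, labelColumnName: "Label");
    }
}
```

With default-off trainers, the same collection could stay the single place where a user widens or narrows the search, so the default sweep stays fast.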
System information
AutoML 0.17.4
Issue
AutoML BinaryClassificationExperiment is currently missing support for several binary classification trainers (LDSVM, GAM, etc.). In particular, adding LDSVM would be helpful, as it has a large number of parameters and does not really overlap with the existing trainers.