Open exalate-issue-sync[bot] opened 1 year ago
Megan Kurka commented: [~accountid:5b153fb1b0d76456f36daced] just to clarify, this Jira is only to research whether it is worth optimizing the categorical_encoding parameter in AutoML, not to actually implement it.
To me, research would mean checking how much optimizing this parameter improves performance for a variety of datasets and seeing if that performance improvement outweighs the added time it may take to run.
Please let me know if anything about this Jira is unclear.
Erin LeDell commented: [~accountid:557058:f0137791-c6cb-47bd-bcce-fc81ad4cfefa] Here’s what we can do:
Make a fork of the code with categorical_encoding (all possible values) added to the existing grid searches that support it (GBM, DNN, XGBoost).
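A rough sketch of what adding categorical_encoding to an existing grid could look like with the H2O Python client, using GBM as the example. The helper names (`with_categorical_encoding`, `run_gbm_grid`), the random-search settings, and the list of encoding values are assumptions for illustration; not every algorithm accepts every value, so the list should be checked against the docs for the H2O version in use. This is not the AutoML implementation itself.

```python
from typing import Dict, List

# Assumed set of categorical_encoding values (H2O Python naming).
# Support differs per algorithm and version; verify before running.
CATEGORICAL_ENCODINGS: List[str] = [
    "auto", "enum", "enum_limited", "one_hot_explicit",
    "binary", "eigen", "label_encoder", "sort_by_response",
]


def with_categorical_encoding(
    hyper_params: Dict[str, list],
    encodings: List[str] = CATEGORICAL_ENCODINGS,
) -> Dict[str, list]:
    """Return a copy of an existing grid with categorical_encoding added."""
    extended = {name: list(values) for name, values in hyper_params.items()}
    extended["categorical_encoding"] = list(encodings)
    return extended


def run_gbm_grid(train, x, y, hyper_params, max_models=20, seed=1):
    """Hypothetical helper: random search over the extended GBM grid."""
    # Imported lazily so the grid helper above works without h2o installed.
    from h2o.estimators import H2OGradientBoostingEstimator
    from h2o.grid.grid_search import H2OGridSearch

    grid = H2OGridSearch(
        model=H2OGradientBoostingEstimator,
        hyper_params=with_categorical_encoding(hyper_params),
        search_criteria={
            "strategy": "RandomDiscrete",
            "max_models": max_models,
            "seed": seed,
        },
    )
    grid.train(x=x, y=y, training_frame=train)
    return grid
```

Running the same search once with the original grid and once with the extended grid, on the same datasets and with the same model budget, would give the performance versus runtime comparison discussed above.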
JIRA Issue Migration Info
Jira Issue: PUBDEV-7495
Assignee: Sebastien Poirier
Reporter: Megan Kurka
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A
Research if categorical_encoding should be a parameter that is optimized in AutoML. I have found that sometimes setting this to a value other than AUTO improves results: https://github.com/h2oai/h2o-tutorials/blob/master/best-practices/categorical-predictors/gbm_drf.ipynb