Open exalate-issue-sync[bot] opened 1 year ago
Angela Bartz commented: Pull request merged into rel-yule.
Wendy commented: Chris:
I messed up my answer to your question in the PR. Here is my answer again.
Standardization is applied only to numerical column types. Enum/binary columns are not affected by standardization.
Hope this is clear.
Wendy
Update the transform docs to indicate that the transform is applied only to numerical data. See: [https://h2oai.slack.com/archives/C04KNHH2H/p1584389368047200|https://h2oai.slack.com/archives/C04KNHH2H/p1584389368047200]
Original question (March 16, 2020):
When transform = "standardize" for GLRM, it appears that only numeric columns are standardized and categorical/binary columns are skipped (which seems like the right approach): [https://github.com/h2oai/h2o-3/blob/master/h2o-algos/src/main/java/hex/glrm/GLRM.java#L356-L358|https://github.com/h2oai/h2o-3/blob/master/h2o-algos/src/main/java/hex/glrm/GLRM.java#L356-L358] and [https://github.com/h2oai/h2o-3/blob/master/h2o-algos/src/main/java/hex/glrm/GLRM.java#L1030.|https://github.com/h2oai/h2o-3/blob/master/h2o-algos/src/main/java/hex/glrm/GLRM.java#L1030.] Is that interpretation correct? Either way, is it possible to clarify the treatment of categorical/binary variables in the documentation for transform? [https://github.com//h2oai/h2o-3/blob/master/h2o-docs/src/product/data-science/algo-params/transform.rst|https://github.com//h2oai/h2o-3/blob/master/h2o-docs/src/product/data-science/algo-params/transform.rst]
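For the docs update, the behavior confirmed in this thread (numeric columns are standardized, enum/binary columns are left alone) could be illustrated with something like the sketch below. This is plain Python, not H2O code and not the GLRM implementation: the function name, the column-type labels, and the sample frame are all made up for illustration, and it assumes standardization means subtracting the column mean and dividing by the sample standard deviation.

```python
from statistics import mean, stdev


def standardize_numeric(columns, column_types):
    """Standardize only the columns labeled "numeric"; copy all others unchanged.

    Hypothetical helper for illustration only, not part of H2O.
    Assumes each numeric column has at least two values.
    """
    result = {}
    for name, values in columns.items():
        if column_types[name] == "numeric":
            mu = mean(values)
            sigma = stdev(values)  # sample standard deviation (an assumption)
            # A constant column has sigma == 0; map it to zeros instead of dividing.
            result[name] = [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]
        else:
            # Enum and binary columns are skipped by the transform.
            result[name] = list(values)
    return result


# Made-up example frame with one column of each kind.
frame = {
    "age": [20.0, 30.0, 40.0],
    "color": ["red", "blue", "red"],
    "is_member": [0, 1, 1],
}
types = {"age": "numeric", "color": "enum", "is_member": "binary"}

scaled = standardize_numeric(frame, types)
print(scaled)
```

Note that the binary column is passed through even though its values are 0/1 integers. The decision is driven by the declared column type, not by whether the values happen to be numbers.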