[docs] Group by clarifications

h2oai / h2o-3

I was trying to reference these docs: https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-munging/groupby.html

I think a few clarifications could be helpful. Some ideas:

- Could an example of using the col.names argument of gb.control be provided? I could not figure out how to use it. I also saw this issue, so perhaps it is not working: https://github.com/h2oai/h2o-3/issues/12731
- Under the "R only" header, nrow is listed as an argument, but its description refers to defining column names. That looks like a mistake.
- Perhaps na.methods should be presented as a bullet underneath gb.control for clarity.
- This note:

  "If a list smaller than the number of columns groups is supplied, then the list will be padded by ignore."

  Could that be clarified? Is it saying that the gb.control options are recycled? Could an example be provided where the options are set per variable?
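
To make the last question concrete, here is one possible reading of that note, sketched in plain Python rather than h2o code. The helper name pad_na_methods and the example values are hypothetical, and the padding behavior is an assumption about what the docs mean, not confirmed h2o behavior: a list shorter than the number of aggregated columns is extended with "ignore" instead of being recycled.

```python
def pad_na_methods(na_methods, n_columns, fill="ignore"):
    """Extend na_methods to n_columns entries by appending `fill`.

    Hypothetical helper: it only illustrates one interpretation of the
    docs note and does not call h2o.
    """
    if len(na_methods) > n_columns:
        raise ValueError("more NA methods than aggregated columns")
    return list(na_methods) + [fill] * (n_columns - len(na_methods))


# Assumed scenario: three aggregated columns, only one NA method supplied.
padded = pad_na_methods(["rm"], 3)
print(padded)

# Recycling, by contrast, would repeat the supplied value for every column.
recycled = (["rm"] * 3)[:3]
print(recycled)
```

If the padding reading is right, only the first aggregated column would use "rm" and the rest would fall back to "ignore". If the options are recycled, all three would use "rm". An example in the docs showing which of the two applies would settle it.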

[docs] Group by clarifications #16338