h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

[docs] Group by clarifications #16338

Closed hutch3232 closed 4 days ago

hutch3232 commented 3 months ago

I was trying to reference these docs: https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-munging/groupby.html

I think a few clarifications could be helpful. Some ideas:

  1. Could an example of using the col.names gb.control arg be provided? I could not figure out how to do it. I saw this issue, so perhaps it's not working? https://github.com/h2oai/h2o-3/issues/12731
  2. Under the "R only" header it shows nrow being an argument and the description refers to defining column names. That seems like a mistake.
  3. Perhaps na.methods should be presented as a bullet underneath gb.control for clarity.
  4. This note:

     "If a list smaller than the number of columns groups is supplied, then the list will be padded by ignore."

     Could that be clarified? Is it saying the gb.control options are recycled? Could an example be provided where options are by-variable?
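To make the two possible readings of that note concrete, here is a minimal pure-Python sketch. It does not call H2O and is not H2O's implementation: the helper names `pad_with_ignore` and `recycle`, and the column names, are hypothetical and chosen only for illustration. It shows what "padded by ignore" would mean if taken literally, next to what recycling would mean.

```python
# Hypothetical illustration only: these helpers are NOT part of H2O.
# They contrast two ways a short list of NA methods could be expanded
# to cover every aggregated column.

def pad_with_ignore(na_methods, n_columns, fill="ignore"):
    """Literal reading of the note: missing entries become `fill`."""
    if len(na_methods) > n_columns:
        raise ValueError("more NA methods than aggregated columns")
    return list(na_methods) + [fill] * (n_columns - len(na_methods))


def recycle(na_methods, n_columns):
    """Alternative reading: the supplied values repeat to fill the list."""
    if not na_methods:
        raise ValueError("need at least one NA method to recycle")
    return [na_methods[i % len(na_methods)] for i in range(n_columns)]


# Assumed example columns; any three aggregated columns would do.
agg_columns = ["ArrDelay", "DepDelay", "Distance"]

padded = pad_with_ignore(["rm"], len(agg_columns))
recycled = recycle(["rm"], len(agg_columns))

# Map each column to the NA method it would receive under each reading.
padded_by_column = dict(zip(agg_columns, padded))
recycled_by_column = dict(zip(agg_columns, recycled))
```

Under the padding reading, only the first column gets "rm" and the rest fall back to "ignore"; under the recycling reading, every column gets "rm". Which of these matches the actual behavior is exactly what the docs would need to state.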

wendycwong commented 3 months ago

@hutch3232

Example code here:

[image: screenshot of example code]

So, sorry. Will fix. May take a while.

hannah-tillman commented 4 days ago

closed with: https://github.com/h2oai/h2o-3/pull/16404