tidyverse / dplyr

dplyr: A grammar of data manipulation
https://dplyr.tidyverse.org/

Standard grouped summary #7085

Closed LukasTang closed 2 months ago

LukasTang commented 2 months ago

I feel like a small convenience function is missing that enables users to compute some summary statistics given a grouped data frame for selected variables.

The function I propose, standard_summary, does just that; it works as shown in the screenshot below.

image

Basically, there are two input arguments: df is a data frame, grouped or not, and funs is a list of functions to compute. The default set of functions is the one I personally find most useful for my purposes, but I am happy to discuss it. Tests are included to check that the right output is produced for the starwars dataset as well.
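
As a rough illustration of the idea (not the code from this PR, which is not reproduced in the thread), a function with this shape could be sketched on top of dplyr's across(). The default functions, the restriction to numeric columns, and the output column naming are all assumptions made for the example.

```r
library(dplyr)

# Hypothetical sketch of standard_summary(); the defaults below are assumed,
# not taken from the PR.
standard_summary <- function(df,
                             funs = list(mean = mean, sd = sd,
                                         min = min, max = max)) {
  df %>%
    summarise(
      # Apply every function in `funs` to every numeric, non-grouping column.
      # Output columns are named <column>_<function>, e.g. mpg_mean.
      across(where(is.numeric), funs, .names = "{.col}_{.fn}"),
      .groups = "drop"
    )
}

# Grouped input: one row per group.
mtcars %>%
  select(cyl, mpg, hp) %>%
  group_by(cyl) %>%
  standard_summary()

# Ungrouped input: a single summary row, here with a custom function list.
mtcars %>%
  select(mpg, hp) %>%
  standard_summary(funs = list(median = median))
```

Variables are chosen here with select() before the call, which keeps the interface to the two arguments described above.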

DavisVaughan commented 2 months ago

I appreciate the idea and PR, but this kind of feature belongs in an extension package that builds on top of dplyr.

For example, the skimr package implements things like this already https://github.com/ropensci/skimr

In the future, it is typically best if you open an issue first before doing a PR for a new feature. This avoids you doing extra work for a feature that we unfortunately can't accept! See https://code-review.tidyverse.org/issues/#sec-issue-first-pr-second for more details.

Thank you!

LukasTang commented 2 months ago

Thanks for the feedback, Davis. I had a look at skimr, and it does seem to be a nice package that had slipped my attention (I need to check whether my function is worth another try in that package or whether the package already suffices). It was interesting for me anyhow, because it was my first time trying to contribute something and I am learning by wrongdoing :)