ropensci / skimr

A frictionless, pipeable approach to dealing with summary statistics
https://docs.ropensci.org/skimr

Rename 'skim_variable' to 'variable'? #741

Closed: allefeld closed this issue 3 months ago

allefeld commented 3 months ago

I've used skim_with to create my own custom data summary function, selecting summary functions and renaming columns. But there is something I cannot figure out: How do I change the column name 'skim_variable'?

I'm preparing the data summary for an audience not familiar with the technical background, and the 'skim' in 'skim_variable' is just strange. I tried to rename it to 'variable', but then it's not a skim_df anymore, and the nice output in R Markdown disappears.

michaelquinn32 commented 3 months ago

I'm assuming that you want something different in the default output. If it's in a knitr-rendered report, things might get a bit more complicated.

Unfortunately we've built skimr to need specific column names, so it wouldn't be that easy to provide this directly through the existing API.

That said, you can make a version of the print function to do this for you: https://colab.research.google.com/drive/1zRY_FcpAF-jxHcjsDYGD-ApU-7GnLUyr?authuser=1#scrollTo=-7lhejz6MKv5

allefeld commented 3 months ago

I see, make a wrapper. And then I guess I have to take care of the HTML rendering for R Markdown myself. Makes sense, thanks!

Just for the record, your code:

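library(skimr)
library(dplyr)  # rename() and %>%
library(purrr)  # map()

# Rename skim_variable in one partitioned table, then print it.
rename_and_print <- function(partitioned_df) {
  partitioned_df %>%
    rename(variable = skim_variable) %>%
    print()
}

skimmed <- skim(iris)

summary(skimmed)

# partition() splits the skim_df into one table per variable type.
caught <- skimmed %>%
  partition() %>%
  map(rename_and_print)

michaelquinn32 commented 3 months ago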

Let me know if you need more help with how all of this works within an RMarkdown/ knitr document.
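
For the R Markdown case, one possible approach (a rough sketch, not something confirmed in this thread) is to drop the skimr classes from each partitioned table, rename the column, and hand the result to knitr::kable(). The helper name rename_for_report is hypothetical, and the sketch assumes that as.data.frame() on a partitioned table yields a plain data frame that dplyr::rename() can handle.

```r
library(skimr)
library(dplyr)  # rename() and %>%
library(knitr)  # kable()

# Hypothetical helper: convert one partitioned table to a plain data frame
# (assumption: this drops the skimr-specific classes), then rename the column.
rename_for_report <- function(one_type_df) {
  one_type_df %>%
    as.data.frame() %>%
    rename(variable = skim_variable)
}

# One plain data frame per variable type (for iris: factor and numeric).
renamed <- skim(iris) %>%
  partition() %>%
  lapply(rename_for_report)

# One kable per variable type; knitr picks the table format for the document.
tables <- lapply(renamed, kable)
```

In a chunk with results='asis', something like `for (t in tables) cat(t, sep = "\n")` should emit the tables, though the inline histograms and other skimr-specific formatting may not carry over.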