ropensci / skimr

A frictionless, pipeable approach to dealing with summary statistics
https://docs.ropensci.org/skimr

Improve nested usage by allowing "data_name" to be overridden as an arg. #614

Closed jw5 closed 3 years ago

jw5 commented 3 years ago

With a small backward-compatible change, it would be possible to invoke skim from within other functions and pass the original variable name from a caller higher up the call stack, rather than using the temporary name internal to the calling function.

For example

myFunc <- function(x) skim(x)

myFunc(myData)

Shows the name of myData as "x", rather than the more desirable "myData".

However, with a small change, the current code:

skim <- function (data, ...) 
{
    data_name <- rlang::expr_label(substitute(data))
    if (!inherits(data, "data.frame")) {
        data <- as.data.frame(data)
    }
...

could be changed to this:

skim <- function (data, data_name = rlang::expr_label(substitute(data)), ...) 
{
    if (!inherits(data, "data.frame")) {
        data <- as.data.frame(data)
    }
...

This would allow

myFunc <- function(x) skim(x, data_name = deparse(substitute(x)))

myFunc(myData)

to display the original variable name "myData" in the skim output.
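
The naming behaviour described above can be reproduced without skimr. The following is only a minimal sketch: `show_name()` and `wrapper()` are hypothetical stand-ins, and base R's `deparse(substitute())` is used in place of `rlang::expr_label()` so that the example has no dependencies. One detail is an assumption about how the default would need to be handled: R evaluates default arguments lazily, and `substitute(data)` returns the value rather than the calling expression once `data` has been reassigned, so the sketch forces `data_name` before the `as.data.frame()` conversion.

```r
# Hypothetical stand-in for skim(): it only reports the name it resolved.
show_name <- function(data, data_name = deparse(substitute(data)), ...) {
  # Resolve the name while `data` still refers to the caller's expression.
  force(data_name)
  if (!inherits(data, "data.frame")) {
    data <- as.data.frame(data)
  }
  data_name
}

# Without an override, the wrapper's internal argument name is reported.
plain_wrapper <- function(x) show_name(x)

# With the override, the wrapper forwards the name used by its own caller.
wrapper <- function(x) show_name(x, data_name = deparse(substitute(x)))

myData <- data.frame(a = 1:3)
show_name(myData)      # direct call
plain_wrapper(myData)  # nested call, no override
wrapper(myData)        # nested call, name passed through
```

The direct call and the overriding wrapper both resolve to the caller's variable name, while the plain wrapper resolves to its own argument name.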

Thanks, Jim

elinw commented 3 years ago

That's very nice! @michaelquinn32 do you agree?

michaelquinn32 commented 3 years ago

This works for me! Would you mind opening a pull request so that we can merge this into the code?

elinw commented 3 years ago

I am going to mention that the reason we do that currently is so we can display the data name in the print by making it an attribute, df_name. We could also create a convenience function to return that.

elinw commented 3 years ago

@jw5 If you would send us a PR, please also add yourself as a contributor in the description file.

elinw commented 3 years ago

@jw5 @michaelquinn32 I think it would make sense to get this into the next release, do you agree?

michaelquinn32 commented 3 years ago

I agree. This is a great addition. Thanks!

elinw commented 3 years ago

Interestingly it also has the benefit of removing the `` (backticks) around the data set name.

It looks to me as though this breaks tidy select. I'll push up a branch with failing tests after I look at it some more.

It's in the dataname branch.
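
One plausible cause of the tidy select problem, not confirmed in this thread, is positional argument matching: if `data_name` sits before `...` in the signature, an unnamed second argument such as a column name is matched to `data_name` instead of falling into `...`. The sketch below uses two hypothetical stand-in functions (`before_dots()` and `after_dots()`, neither of which is skimr code) that report how R matched the arguments of a call.

```r
# Hypothetical stand-ins that return how each supplied argument was matched.
before_dots <- function(data, data_name = NULL, ...) {
  cl <- match.call()
  vapply(as.list(cl)[-1], deparse, character(1))
}

after_dots <- function(data, ..., data_name = NULL) {
  cl <- match.call()
  vapply(as.list(cl)[-1], deparse, character(1))
}

# A skim(iris, Species) style call: the column is the second positional argument.
before_dots(iris, Species)  # Species is matched to data_name
after_dots(iris, Species)   # Species stays in `...`

# With data_name after `...`, it can only be supplied by name.
after_dots(iris, Species, data_name = "iris")
```

Under this assumption, placing the new argument after `...` would keep existing positional column selections working, at the cost of requiring callers to name it.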