shim common functions - Githubissues

beanumber commented 6 years ago

Input functions from tidyverse (read_csv(), etc.) and read.csv()
Output functions (write_csv(), write.csv(), etc.)
library() and require() (@hadley)
source()

Do a handful of these manually, and then consider a function factory or some kind of tidy evaluation.

Log the paths and calls to a CSV file (i.e., .fertile.csv)

In interactive mode, throw an error if the path is bad.

Have a sliding scale of strictness

beanumber commented 6 years ago

you could also mine https://github.com/hadley/strict for ideas/implementations

beanumber commented 6 years ago

setwd() should probably throw an error (in the strict case)

beanumber commented 6 years ago

ggplot2::save()?

hadley commented 6 years ago

ggsave()?

beanumber commented 6 years ago

Yes -- my bad.

beanumber commented 6 years ago

TODO: use a function factory to create a whole slew of functions at once.

beanumber commented 6 years ago

https://adv-r.hadley.nz/function-factories.html

beanumber commented 6 years ago

https://adv-r.hadley.nz/function-factories.html#another-approach

beanumber commented 6 years ago

https://adv-r.hadley.nz/function-factories.html#moving-a-list-to-the-global-environment

beanumber commented 6 years ago

Alright, @hadley @jennybc the first step is to admit you have a problem. Mine is that I still don't understand tidyeval.

I'm trying to do this:

#' Function factory

shim_log_input <- function(.f) {
  rlang::new_function(
    rlang::exprs(file = ),
    rlang::expr({
      log_push(file, !!.f)
      check_path(file)
      .f(file)
    }),
    rlang::caller_env()
  )
}

funs_to_shim <- c(
  utils::read.csv,
  readr::read_csv
)

shims <- purrr::map(funs_to_shim, shim_log_input)
shims

in log_push(file, !!.f), I want a string with the name of the function. !!.f is returning the function itself (which I sort of understand), but how do I get the name of the function as a string?
in .f(file), I want to actually execute the function. But if I put !!.f(file) here, it seems to be running when the functions are created by map, which of course errors. That is, when I am simply trying to create the shimmed function for read.csv(), it seems to actually be trying to execute read.csv(), which fails.
and then we have to deal with the ....
and then we have to name the shimmed functions, and attach() them (!)
and then we have to write the documentation. Can this be automated?
and then we have to write a second function factory for output functions like write.csv() that require an object and a path? And another one for function like library() that don't involve paths at all?
and then what happens when the argument names vary (e.g., file vs. filename)?

Is this the right approach?

jennybc commented 6 years ago

I played around with some of your subtasks separately here, e.g. recovering the name of .f, looking at its formals, wrapping a function, wrapping a list of functions. It's not all wired up together, but maybe this provides some ideas you can build on until @hadley swoops 🙂. .f cannot ever know its own name -- you have to recover that by quoting the input.

library(rlang)

function_report <- function(.f) {
  .f_quoted <- enquo(.f)
  list(
    .f_quoted,
    .f_name = expr_text(quo_get_expr(.f_quoted)),
    formal_names = fn_fmls_names(.f)
  )
}

function_report(read.csv)
#> [[1]]
#> <quosure>
#>   expr: ^read.csv
#>   env:  global
#> 
#> $.f_name
#> [1] "read.csv"
#> 
#> $formal_names
#> [1] "file"         "header"       "sep"          "quote"       
#> [5] "dec"          "fill"         "comment.char" "..."

# forgive this horrible name, I'm not feeling creative
foo <- function(.f) {
  function(...) {
    l <- list(...)
    cat("file = ", l[["file"]], "\n")
    .f(...)
  }
}

read.csv(file = system.file("extdata/mtcars.csv", package = "readr"), nrows = 3)
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1

read.csv_wrapped <- foo(read.csv)
read.csv_wrapped(
  file = system.file("extdata/mtcars.csv", package = "readr"),
  nrows = 3
)
#> file =  /Users/jenny/R/library/3.5/readr/extdata/mtcars.csv
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1

funs_to_wrap <- c(read.csv = utils::read.csv, read_csv = readr::read_csv)
wraps <- lapply(funs_to_wrap, foo)
wraps
#> $read.csv
#> function (...) 
#> {
#>     l <- list(...)
#>     cat("file = ", l[["file"]], "\n")
#>     .f(...)
#> }
#> <bytecode: 0x7fe1d44fa2d8>
#> <environment: 0x7fe1d43a39e0>
#> 
#> $read_csv
#> function (...) 
#> {
#>     l <- list(...)
#>     cat("file = ", l[["file"]], "\n")
#>     .f(...)
#> }
#> <bytecode: 0x7fe1d44fa2d8>
#> <environment: 0x7fe1d45008f8>

wraps$read_csv(
  file = system.file("extdata/mtcars.csv", package = "readr"),
  n_max = 3
)
#> file =  /Users/jenny/R/library/3.5/readr/extdata/mtcars.csv
#> Parsed with column specification:
#> cols(
#>   mpg = col_double(),
#>   cyl = col_integer(),
#>   disp = col_integer(),
#>   hp = col_integer(),
#>   drat = col_double(),
#>   wt = col_double(),
#>   qsec = col_double(),
#>   vs = col_integer(),
#>   am = col_integer(),
#>   gear = col_integer(),
#>   carb = col_integer()
#> )
#> # A tibble: 3 x 11
#>     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>   <dbl> <int> <int> <int> <dbl> <dbl> <dbl> <int> <int> <int> <int>
#> 1  21       6   160   110  3.9   2.62  16.5     0     1     4     4
#> 2  21       6   160   110  3.9   2.88  17.0     0     1     4     4
#> 3  22.8     4   108    93  3.85  2.32  18.6     1     1     4     1

^{Created on 2018-10-15 by the reprex package (v0.2.1)}

hadley commented 6 years ago

I don't have time to explain anything right now, but hopefully this gets you slightly unblocked:

library(rlang)
shim_log_input <- function(.f) {
  rlang::new_function(
    rlang::exprs(file = ),
    rlang::expr({
      log_push(file, !!(deparse(.f)))
      check_path(file)
      (!!.f)(file)
    }),
    rlang::caller_env()
  )
}

funs_to_shim <- exprs(
  utils::read.csv,
  readr::read_csv
)

shims <- purrr::map(funs_to_shim, shim_log_input)
shims
#> [[1]]
#> function (file) 
#> {
#>     log_push(file, "utils::read.csv")
#>     check_path(file)
#>     utils::read.csv(file)
#> }
#> <environment: 0x7ffd4b301378>
#> 
#> [[2]]
#> function (file) 
#> {
#>     log_push(file, "readr::read_csv")
#>     check_path(file)
#>     readr::read_csv(file)
#> }
#> <environment: 0x7ffd4b301378>

(note that funs_to_shim now uses exprs())

beanumber commented 6 years ago

Thank you both for the very helpful responses. I think I got it to work in ec7dea12.

I'm still not quite sure what to do about the varying argument names.

jennybc commented 6 years ago

Your shimming function will presumably have to take the input function and the name of the argument you want to give special treatment, possibly defaulting to file (?).

beanumber commented 6 years ago

On second thought, @hadley advises doing these individually.

baumer-lab / fertile

shim common functions #3