baumer-lab / fertile

creating optimal conditions for reproducibility
GNU General Public License v3.0
52 stars 4 forks source link

shim common functions #3

Closed beanumber closed 4 years ago

beanumber commented 6 years ago

Do a handful of these manually, and then consider a function factory or some kind of tidy evaluation.

Log the paths and calls to a CSV file (i.e., .fertile.csv)

In interactive mode, throw an error if the path is bad.

Have a sliding scale of strictness

beanumber commented 6 years ago

you could also mine https://github.com/hadley/strict for ideas/implementations

beanumber commented 6 years ago

setwd() should probably throw an error (in the strict case)

beanumber commented 6 years ago

ggplot2::save()?

hadley commented 6 years ago

ggsave()?

beanumber commented 6 years ago

Yes -- my bad.

beanumber commented 6 years ago

TODO: use a function factory to create a whole slew of functions at once.

beanumber commented 6 years ago

https://adv-r.hadley.nz/function-factories.html

beanumber commented 6 years ago

https://adv-r.hadley.nz/function-factories.html#another-approach

beanumber commented 6 years ago

https://adv-r.hadley.nz/function-factories.html#moving-a-list-to-the-global-environment

beanumber commented 6 years ago

Alright, @hadley @jennybc the first step is to admit you have a problem. Mine is that I still don't understand tidyeval.

I'm trying to do this:

#' Function factory

shim_log_input <- function(.f) {
  rlang::new_function(
    rlang::exprs(file = ),
    rlang::expr({
      log_push(file, !!.f)
      check_path(file)
      .f(file)
    }),
    rlang::caller_env()
  )
}

funs_to_shim <- c(
  utils::read.csv,
  readr::read_csv
)

shims <- purrr::map(funs_to_shim, shim_log_input)
shims

Is this the right approach?

jennybc commented 6 years ago

I played around with some of your subtasks separately here, e.g. recovering the name of .f, looking at its formals, wrapping a function, wrapping a list of functions. It's not all wired up together, but maybe this provides some ideas you can build on until @hadley swoops 🙂. .f cannot ever know its own name -- you have to recover that by quoting the input.

library(rlang)

function_report <- function(.f) {
  .f_quoted <- enquo(.f)
  list(
    .f_quoted,
    .f_name = expr_text(quo_get_expr(.f_quoted)),
    formal_names = fn_fmls_names(.f)
  )
}

function_report(read.csv)
#> [[1]]
#> <quosure>
#>   expr: ^read.csv
#>   env:  global
#> 
#> $.f_name
#> [1] "read.csv"
#> 
#> $formal_names
#> [1] "file"         "header"       "sep"          "quote"       
#> [5] "dec"          "fill"         "comment.char" "..."

# forgive this horrible name, I'm not feeling creative
foo <- function(.f) {
  function(...) {
    l <- list(...)
    cat("file = ", l[["file"]], "\n")
    .f(...)
  }
}

read.csv(file = system.file("extdata/mtcars.csv", package = "readr"), nrows = 3)
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1

read.csv_wrapped <- foo(read.csv)
read.csv_wrapped(
  file = system.file("extdata/mtcars.csv", package = "readr"),
  nrows = 3
)
#> file =  /Users/jenny/R/library/3.5/readr/extdata/mtcars.csv
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1

funs_to_wrap <- c(read.csv = utils::read.csv, read_csv = readr::read_csv)
wraps <- lapply(funs_to_wrap, foo)
wraps
#> $read.csv
#> function (...) 
#> {
#>     l <- list(...)
#>     cat("file = ", l[["file"]], "\n")
#>     .f(...)
#> }
#> <bytecode: 0x7fe1d44fa2d8>
#> <environment: 0x7fe1d43a39e0>
#> 
#> $read_csv
#> function (...) 
#> {
#>     l <- list(...)
#>     cat("file = ", l[["file"]], "\n")
#>     .f(...)
#> }
#> <bytecode: 0x7fe1d44fa2d8>
#> <environment: 0x7fe1d45008f8>

wraps$read_csv(
  file = system.file("extdata/mtcars.csv", package = "readr"),
  n_max = 3
)
#> file =  /Users/jenny/R/library/3.5/readr/extdata/mtcars.csv
#> Parsed with column specification:
#> cols(
#>   mpg = col_double(),
#>   cyl = col_integer(),
#>   disp = col_integer(),
#>   hp = col_integer(),
#>   drat = col_double(),
#>   wt = col_double(),
#>   qsec = col_double(),
#>   vs = col_integer(),
#>   am = col_integer(),
#>   gear = col_integer(),
#>   carb = col_integer()
#> )
#> # A tibble: 3 x 11
#>     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>   <dbl> <int> <int> <int> <dbl> <dbl> <dbl> <int> <int> <int> <int>
#> 1  21       6   160   110  3.9   2.62  16.5     0     1     4     4
#> 2  21       6   160   110  3.9   2.88  17.0     0     1     4     4
#> 3  22.8     4   108    93  3.85  2.32  18.6     1     1     4     1

Created on 2018-10-15 by the reprex package (v0.2.1)

hadley commented 6 years ago

I don't have time to explain anything right now, but hopefully this gets you slightly unblocked:

library(rlang)
shim_log_input <- function(.f) {
  rlang::new_function(
    rlang::exprs(file = ),
    rlang::expr({
      log_push(file, !!(deparse(.f)))
      check_path(file)
      (!!.f)(file)
    }),
    rlang::caller_env()
  )
}

funs_to_shim <- exprs(
  utils::read.csv,
  readr::read_csv
)

shims <- purrr::map(funs_to_shim, shim_log_input)
shims
#> [[1]]
#> function (file) 
#> {
#>     log_push(file, "utils::read.csv")
#>     check_path(file)
#>     utils::read.csv(file)
#> }
#> <environment: 0x7ffd4b301378>
#> 
#> [[2]]
#> function (file) 
#> {
#>     log_push(file, "readr::read_csv")
#>     check_path(file)
#>     readr::read_csv(file)
#> }
#> <environment: 0x7ffd4b301378>

(note that funs_to_shim now uses exprs())

beanumber commented 6 years ago

Thank you both for the very helpful responses. I think I got it to work in ec7dea12.

I'm still not quite sure what to do about the varying argument names.

jennybc commented 6 years ago

Your shimming function will presumably have to take the input function and the name of the argument you want to give special treatment, possibly defaulting to file (?).

beanumber commented 6 years ago

On second thought, @hadley advises doing these individually.