KentonWhite / ProjectTemplate

A template utility for R projects that provides a skeletal project.
http://projecttemplate.net
GNU General Public License v3.0
623 stars 159 forks source link

Feature request: Allow to pass custom arguments to the reader functions #236

Open Hugovdberg opened 6 years ago

Hugovdberg commented 6 years ago

Report an Issue / Request a Feature

I'm submitting a (Check one with "x") :


Issue Severity Classification -
Expected Behavior

The arguments to a specific reader are available through the file.reader.

Current Behavior

All readers take the path to the data file and the variable to store the data as arguments. When a specific file needs special treatment there is no way to override the default options.

Possible Solution

A few steps are necessary to implement this:

Related issues:
KentonWhite commented 6 years ago

There are some custom readers in the wild. Changing the signature in this way may cause some trouble with the custom readers. Could setMethod be used to preserve the original signature?

Hugovdberg commented 6 years ago

I'm thinking we could perhaps do some inspection of the arguments in add.extension using the formals function. If an old style reader is passed it can raise a warning and wrap the reader in a closure with the correct arguments. I haven't tried it yet, but we could do some interface changes as we move to version 1.0.

KentonWhite commented 6 years ago

That sounds too cool!

Hugovdberg commented 6 years ago

A small proof of concept:

# New style reader
test_reader_new_style <- function(file.name, variable.name, ...) {
  return(list(file.name = file.name, 
              variable.name = variable.name, 
              '...' = list(...)))
}
# Old style reader
test_reader_old_style <- function(data.file, file.name, variable.name) {
  return(list(data.file = data.file,
              file.name = file.name, 
              variable.name = variable.name))
}
# Old style reader nr2
test_reader_old_style_2 <- function(data.file, file.name, variable.name) {
  return(list(data.file.2 = data.file,
              file.name.2 = file.name, 
              variable.name.2 = variable.name))
}

# ProjectTemplate:::extensions.dispatch.table
extension_dispatch_env <- new.env()
# ProjectTemplate::.add_extension
test_add_extension <- function(extension, reader_function) {
  if (!identical(names(formals(reader_function)), 
                 c('file.name', 'variable.name', '...'))) {
    warning('A reader with non-standard arguments detected.\n',
            'Assuming old style arguments, for now will be wrapped in a helper function.\n',
            'In a future version this will become an error.')
    # Old style reader, define a function with the new style arguments that
    # calls the function that was passed in. Because this happens in a closure
    # the reference to reader_function will still exist outside once
    # test_add_extension finishes.
    # Can't reuse the old name because that causes a recursion error.
    func_store <- function(file.name, variable.name, ...) {
      reader_function(basename(file.name), file.name, variable.name)
    }
  } else {
    # New style reader, store reference temporarily (see branch above)
    func_store <- reader_function
  }
  # Store the, possibly wrapped, function in the environment
  extension_dispatch_env[[extension]] <- func_store
}

# Set warnings to immediate so we can see the timing when the warning is issued
opts <- options('warn' = 1)
# Add the new reader, should not issue warning
print('Add new')
test_add_extension('a', test_reader_new_style)
# Add the old reader, should issue warning
print('Add old')
test_add_extension('b', test_reader_old_style)
# Add the second old reader, should issue warning
print('Add old 2')
test_add_extension('c', test_reader_old_style_2)
# Call the new reader to check the received arguments
print('New:')
str(extension_dispatch_env$a('data/test.txt', 'test_var', encoding = 'utf-8'))
# Call the old reader to check the received arguments
print('Old:')
str(extension_dispatch_env$b('data/test.txt', 'test_var', encoding = 'utf-8'))
# Call the old reader to check the received arguments
print('Old2:')
str(extension_dispatch_env$c('data/test.txt', 'test_var', encoding = 'utf-8'))
# Reset the warning options
options(opts)

Output:

[1] "Add new"
[1] "Add old"
Warning in test_add_extension("b", test_reader_old_style) :
  A reader with non-standard arguments detected.
Assuming old style arguments, for now will be wrapped in a helper function.
In a future version this will become an error.
[1] "Add old 2"
Warning in test_add_extension("c", test_reader_old_style_2) :
  A reader with non-standard arguments detected.
Assuming old style arguments, for now will be wrapped in a helper function.
In a future version this will become an error.
[1] "New:"
List of 3
 $ file.name    : chr "data/test.txt"
 $ variable.name: chr "test_var"
 $ ...          :List of 1
  ..$ encoding: chr "utf-8"
[1] "Old:"
List of 3
 $ data.file    : chr "test.txt"
 $ file.name    : chr "data/test.txt"
 $ variable.name: chr "test_var"
[1] "Old2:"
List of 3
 $ data.file.2    : chr "test.txt"
 $ file.name.2    : chr "data/test.txt"
 $ variable.name.2: chr "test_var"
Hugovdberg commented 6 years ago

Ok, so after adding some explanation, the proof of concept grew a little bigger. The main point of adding two old functions is to show that references to the functions are correctly stored in the closures.

KentonWhite commented 6 years ago

Cool! And the idea is that we flag it as deprecated for now. We should also suggest that the user contact the reader maintainer to fix the interface.