egnha / valaddin

Functional input validation to make R functions more readable and robust
Other
33 stars 1 forks source link

Input-validation environment must encompass promises and predicates #32

Closed egnha closed 7 years ago

egnha commented 7 years ago

Here's an example of a problem that arises in the current implementation of valaddin when the function environment is not on the search path of the temporary environment created in which inputs are validated (the input-validation environment):

# This simulates the creation of a function in a package
foo <- (function() {
  internal <- "internal"
  function(x = internal) x
})()
parent.env(environment(foo)) <- baseenv()

foo()
#> [1] "internal"

valaddin::firmly(foo, ~is.character)()
#> Error: (valaddin::firmly(foo, ~is.character))()
#> Error evaluating check is.character(x): object 'internal' not found

The result should have been the same as foo(). Why couldn't firmly find the object internal?

The reason is that internal is inaccessible from the context of input validation. The enclosure of the input-validation environment is the environment of the formula ~is.character, which is the global environment. However, the search path of the global environment intersects the search path of foo at the base environment, which is the enclosing environment of the environment that binds internal. This is why firmly can't find the object internal.

With that understanding, the problem can be resolved with the following construction:

NB: Because of R's lexical scoping rules for promises with default value vs promises with caller-supplied value, it is necessary that the enclosing environment of the promise environment be the environment of the function.

Mockup of refined input-validation procedure:

chk <- ~is.character

# Temporary environment in which to validate inputs
ve <- new.env(parent = emptyenv()) 
# ve has a single binding (the environment of promises), randomly named
.PROMENV <- sprintf("__PROM.ENV__%.12f", runif(1L))
ve[[.PROMENV]] <- (function(x = internal) environment())()
parent.env(ve[[.PROMENV]]) <- environment(foo)

# Evaluating the input check in ve now works
expr <- substitute(lazyeval::f_eval(chk)(get("x", ENV), list(ENV = as.name(.PROMENV)))
# Vary the enclosing environment of `ve` according to the check formula
parent.env(ve) <- lazyeval::f_env(chk)
eval(expr, ve)
#> [1] TRUE

In the actual procedure, the verbose eval line would be the result of simple substitution, which would transform the expression

quote(lazyeval::f_eval(chk)(x))

to the expression of the form

quote((lazyeval::f_eval(chk))(get("x", `__PROM.ENV__0.805840510409`)))

i.e., substitution on cdr of the call expression. (NB. It is important to use get rather than $ to access environment bindings, otherwise missing promises would return (empty) symbols.)

Remark: The proposed fix works "almost surely," but is not guaranteed to work unconditionally, for the name of the promise environment in ve—the value of the randomly generated string .PROMENV—could, in theory, clash with a name in the lexical scope of chk. The probability of this happening accidentally is practically nil. (A more complex validation mechanism could guarantee zero name collisions, at the expense of performance and a considerably more complicated parsing-substitution procedure.)