r-lib / R6

Encapsulated object-oriented programming for R
https://R6.r-lib.org

Using closure in active binding #80

Closed bernardkaplan closed 8 years ago

bernardkaplan commented 8 years ago

I'm trying to use closures in active bindings to simplify the code and avoid duplication. However, R doesn't seem to resolve the variable in the closure environment as I would expect. Forcing name substitution with the unenclose function from the pryr package gives me the expected result, but I'd like to find a more elegant solution.

Here is a small piece of code to illustrate my point. First I define a base class, Base, with a single private field.

library(R6)

Base <- R6Class("Base",
  public = list(
    initialize = function(x) {
      private$x <- x
    }
  ),
  private = list(
    x = NA
  )
)

The class A defines two active bindings that are identical except for the exponent.

A <- R6Class("A", inherit = Base,
  active = list(
    Square = function() private$x ^ 2,
    Cube = function() private$x ^ 3
  )
)

a <- A$new(5)
a$Square # returns 25
a$Cube # returns 125

Using a closure could reduce the amount of code and, more importantly, avoid duplication.

power <- function(exp) {
  function() private$x ^ exp
}

B <- R6Class("B", inherit = Base,
  active = list(
    A = power(2),
    B = power(3)
  )
)

b <- B$new(5)
b$A # Returns an error
b$B # Returns an error

However, R rejects this code with the following error (translated from French):

Error in private$x^exp : non-numeric argument to binary operator

The only way I could think of to circumvent this issue is to call the unenclose function from pryr.

library(pryr)
power <- function(exp) {
  f <- function() private$x ^ exp
  unenclose(f)
}

B <- R6Class("B", inherit = Base,
  active = list(
    A = power(2),
    B = power(3)
  )
)

b <- B$new(5)
b$A # returns 25
b$B # returns 125

Could you think of a better, more elegant way to tackle this issue?

wch commented 8 years ago

As you've probably figured out, the code doesn't work because the parent environment for methods (and active bindings) gets reassigned when the object is instantiated, so the function loses the exp stored in its original enclosing environment.
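A minimal sketch of that effect in base R, without R6 itself: it only simulates the reassignment with `environment<-`, and the names `square`, `before`, and `res` are illustrative. Once the enclosing environment is replaced, `exp` is no longer the captured number, and lookup presumably falls through to the `base::exp` function, which would explain the "non-numeric argument" error reported above.

```r
power <- function(exp) {
  function(x) x ^ exp
}

square <- power(2)
before <- square(5)  # 25: `exp` is found in the closure's enclosing environment

# Simulate what happens at instantiation: the function gets a new
# enclosing environment that does not contain `exp`.
environment(square) <- new.env(parent = globalenv())

# `exp` now resolves further up the search path (to the base function exp),
# so `^` receives a function instead of a number and fails.
res <- tryCatch(square(5), error = function(e) conditionMessage(e))
print(res)  # the error message as a string
```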

In terms of saving memory, I wouldn't worry about that. When a function is duplicated and assigned a new environment, the body is not copied -- for the body, both functions point to the same memory location. For example:

f <- function() {
  1+1
}

# Copy function and assign new environment
g <- f
environment(g) <- new.env()

The formals and body memory addresses are the same, but the two functions have different enclosing environments:

.Internal(inspect(f))
# @15e5430a0 03 CLOSXP g0c0 [NAM(2),ATT] 
# FORMALS:
#   @102800d78 00 NILSXP g1c0 [MARK] 
# BODY:
#   @15e542718 06 LANGSXP g0c0 [ATT] 
# ....
# CLOENV:
#   @102855358 04 ENVSXP g1c0 [MARK,NAM(2),GL,gp=0x8000] <R_GlobalEnv>
# ....

.Internal(inspect(g))
# @15e543638 03 CLOSXP g0c0 [NAM(2),ATT]
# FORMALS:
#   @102800d78 00 NILSXP g1c0 [MARK,NAM(2)]
# BODY:
#   @15e542718 06 LANGSXP g0c0 [ATT]
# ....
# CLOENV:
#   @15e5436e0 04 ENVSXP g0c0 [NAM(1)] <0x15e5436e0>
# ....

So I don't think you'd save any memory by having smaller function bodies: each function object is still copied per instance while the body is shared, so the incremental memory cost is the same.

schloerke commented 8 years ago

Is there a major reason why the enclosing environment of each function is overwritten? Would it be possible to wrap each function in an environment and then reassign the 'wrapper' when the object is initialized?

Being able to use closures would be greatly appreciated, both to reduce code duplication and to precompute information.

I know this example can be worked around, but in general it will be fairly difficult.

TakesTime <- R6Class("TakesTime",

  public = list(
    get_fib_ans = function() {

      fib <- function(n) {
        if (n < 2) return(1)
        fib(n - 1) + fib(n - 2)
      }

      ans <- fib(28)
      ans
    }
  )
)

tt <- TakesTime$new()
tt$get_fib_ans() # calculates
tt$get_fib_ans() # calculates
tt$get_fib_ans() # calculates
tt$get_fib_ans() # calculates

NoTime <- R6Class("NoTime",
  public = list(
    get_fib_ans = (function() {
      fib <- function(n) {
        if (n < 2) return(1)
        fib(n - 1) + fib(n - 2)
      }

      ans <- fib(28)

      function() {
        ans
      }
    })()
  )
)

nt <- NoTime$new() # fib(28) was already computed once, when the class was defined
nt$get_fib_ans() # should be instant, but won't work
# Error in nt$get_fib_ans() : object 'ans' not found

wch commented 8 years ago

@schloerke The env is changed because that is fundamental to how R6 objects work; it's how private and self can point to the right objects.

I don't think there's a way to wrap environments and make it work: first, it's unsafe in R to reassign an environment's parent, and second, even if you did, a closure might rely on its parent's parent (or parent's parent's parent) for some values.

In an example like the one above, if it did work the way you want, then the R6 objects would be sharing ans across all of them, which would add another layer of complexity.

The only way I can see to make something like this work is to modify the body of the function, using substitute() or something similar (like the example above which uses pryr::unenclose).
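A minimal sketch of the substitute() approach, applied to the power example from the original post. It assumes the R6 package is installed; the placeholder name `e` and the temporary function `f` are illustrative choices, not part of any API. The value of the exponent is written directly into the function body, so nothing has to be looked up in the enclosing environment after R6 replaces it.

```r
library(R6)

Base <- R6Class("Base",
  public = list(
    initialize = function(x) {
      private$x <- x
    }
  ),
  private = list(
    x = NA
  )
)

# Build a function whose body contains the literal exponent.
# substitute() replaces the placeholder `e` with the value of `exp`;
# `private` and `x` are left as symbols and resolved by R6 at call time.
power <- function(exp) {
  f <- function() NULL
  body(f) <- substitute(private$x ^ e, list(e = exp))
  f
}

B <- R6Class("B", inherit = Base,
  active = list(
    A = power(2),
    B = power(3)
  )
)

b <- B$new(5)
b$A # 25
b$B # 125
```

The same idea should carry over to a precomputed value such as ans in the NoTime example: compute it once, then inline the result into the method body. Each object then carries the value in its function body rather than sharing a captured variable.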

schloerke commented 8 years ago

Makes sense. Thank you.

pryr::unenclose was new to me, and I was having trouble figuring out where and how many times to call it.

Thank you for the clarification!