r-lib / R6

Encapsulated object-oriented programming for R
https://R6.r-lib.org
Other
405 stars 56 forks source link

[bug?] super behaving unexpectedly with active bindings #107

Closed epicfarmer closed 7 years ago

epicfarmer commented 7 years ago

In the following example class_2 has an active binding which it inherits from class_1. In class_2, the super method on the binding does not work as expected. class_2$foo throws an error when value is missing. I can get around this by checking to see if value is missing (as in class_3), is this the correct way of doing it? I have also tried super$foo(value), but that doesn't work at all.

class_1= R6::R6Class(
    private = list(
        bar = 1
    ),
    active = list(
        foo = function(value){
            if(missing(value)){
                return(private$bar)
            }
            private$bar = value
        }
    )
)
class_2= R6::R6Class(inherit = class_1,
    active = list(
        foo = function(value){
            private$bar = private$bar/2
            super$foo <- value
        }
    )
)
class_3= R6::R6Class(inherit = class_1,
    active = list(
        foo = function(value){
            private$bar = private$bar/2
            if(missing(value)) {return(super$foo)}
            super$foo <- value
        }
    )
)
test_1 <- class_1$new()
test_1$foo <- 3
test_1$foo
test_2 <- class_2$new()
test_2$foo <- 3
test_2$foo
test_3 <- class_3$new()
test_3$foo <- 3
test_3$foo
wch commented 7 years ago

I think this may actually a be a bug in R. It seems to be treating the expression a$b <- c differently from the equivalent verbose form, `$<-`(a, "b", c), when c is a missing value that's passed along from another function call.

EDIT: As Kevin pointed out below, a$b <- c is actually equivalent to `<-`(a$b, c), and the two do indeed have the same behavior. Knowing this, the code examples below aren't relevant because g uses the wrong expression.

Here's an example:

# =========================================================
# First, create a $<-.foo method. All it does is tell us if a value
# was passed in, or if it was missing. It doesn't modify the object.
`$<-.foo` <- function(x, name, value) {
  if (missing(value))
    cat("value is missing\n")
  else
    cat(paste0("value is ", value, "\n"))

  x
}

# Create a foo object, which we'll use later
foo_obj <- structure(list(), class = "foo")

# =========================================================
# f() simply assigns a value to x$a
f <- function (x, value) {
  x$a <- value
}

# Call it with a value: OK
f(foo_obj, 2)
#> value is 2

# Call it without a value: Error
f(foo_obj)
#> Error in f(foo_obj) : argument "value" is missing, with no default

# =========================================================
# g is just like f, except that it uses the more explicit call 
g <- function (x, value) {
  `$<-`(x, "a", value)
}

# Call it with a value: OK
g(foo_obj, 2)
#> value is 2
#> list()
#> attr(,"class")
#> [1] "foo"

# Call it with a value: OK (!) 
# It invokes `<-.foo` with a missing `value`
g(foo_obj)
#> value is missing
#> list()
#> attr(,"class")
#> [1] "foo"

It looks like there are two differences: One is that the verbose `$<-`(a, "b", c) form passes along the missing state of value. The second is that the verbose form also prints the result. I thought that the concise and verbose forms of the expression were perfectly equivalent.

@hadley, @jimhester, @kevinushey Do you guys know anything about why these are behaving differently?

kevinushey commented 7 years ago

Some debugging seems to indicate that R evaluates something like this:

`<-`(x$a, value)

And the assignment operator forces the promise value, thereby causing the missingness error. (When the promise is successfully forced, <- then dispatches to $<-.foo, to handle the call appropriately.) All in all, R is forcing value behind the scenes. See https://github.com/wch/r-source/blob/871172dfa2798ea401be4982c85a4653a0144de8/src/main/eval.c#L2536-L2541 for the associated code.

FWIW here's the stack trace I see when the error is triggered:

(lldb) bt
* thread #1: tid = 0x11151c, 0x0000000101d30422 libR.dylib`Rf_error(format="argument \"%s\" is missing, with no default") + 98 at errors.c:793, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000101d30422 libR.dylib`Rf_error(format="argument \"%s\" is missing, with no default") + 98 at errors.c:793 [opt]
    frame #1: 0x0000000101d47a04 libR.dylib`Rf_eval(e=<unavailable>, rho=<unavailable>) + 11460 at eval.c:640 [opt]
  * frame #2: 0x0000000101da82f7 libR.dylib`applydefine(call=0x000061d0023fc2b0, op=<unavailable>, args=0x000061d0023fc278, rho=0x000061d0023f8640) + 471 at eval.c:2066 [opt]
    frame #3: 0x0000000101d456af libR.dylib`Rf_eval(e=<unavailable>, rho=0x000061d0023f8640) + 2415 at eval.c:695 [opt]
    frame #4: 0x0000000101da636b libR.dylib`do_begin(call=0x000061d0023fc2e8, op=0x000061d0000089c0, args=0x000061d0023fc3c8, rho=0x000061d0023f8640) + 1051 at eval.c:1835 [opt]
    frame #5: 0x0000000101d456af libR.dylib`Rf_eval(e=<unavailable>, rho=0x000061d0023f8640) + 2415 at eval.c:695 [opt]
    frame #6: 0x0000000101d9a4d5 libR.dylib`Rf_applyClosure(call=<unavailable>, op=<unavailable>, arglist=<unavailable>, rho=<unavailable>, suppliedvars=<unavailable>) + 5621 at eval.c:1160 [opt]
    frame #7: 0x0000000101d45d17 libR.dylib`Rf_eval(e=<unavailable>, rho=0x000061d00003d7d0) + 4055 at eval.c:742 [opt]
    frame #8: 0x0000000101e439f0 libR.dylib`Rf_ReplIteration(rho=0x000061d00003d7d0, savestack=0, browselevel=0, state=<unavailable>) + 1184 at main.c:258 [opt]
    frame #9: 0x0000000101e48281 libR.dylib`R_ReplConsole(rho=0x000061d00003d7d0, savestack=0, browselevel=0) + 433 at main.c:308 [opt]
    frame #10: 0x0000000101e48088 libR.dylib`run_Rmainloop + 152 at main.c:1059 [opt]
    frame #11: 0x0000000100000ea5 R`main(ac=<unavailable>, av=<unavailable>) + 53 at Rmain.c:29 [opt]
    frame #12: 0x00007fff9b2fc255 libdyld.dylib`start + 1

With frame 2's contents:

frame #2: 0x0000000101da82f7 libR.dylib`applydefine(call=0x000061d0023fc2b0, op=<unavailable>, args=0x000061d0023fc278, rho=0x000061d0023f8640) + 471 at eval.c:2066 [opt]
   2063         assignment is right associative i.e.  a <- b <- c is parsed as
   2064         a <- (b <- c).  */
   2065
-> 2066     PROTECT(saverhs = rhs = eval(CADR(args), rho));
   2067     INCREMENT_REFCNT(saverhs);
   2068
   2069     /*  FIXME: We need to ensure that this works for hashed
wch commented 7 years ago

@kevinushey Thanks for looking into this! The explanation makes sense, looking at the code that you pointed to.

It's not clear to me, though, why the RHS needs to be evaluated first. If <- were a regular R function, it wouldn't need to be -- if you have f(a, f(b, c)), the f(b, c) would not be evaluated first. Maybe it has something to do with <- being a primitive function.

epicfarmer commented 7 years ago

@wch , as an update, the workaround described above also produces some weird (potentially unrelated behavior). If in addition to the above code,

test_4 = test_3$clone(TRUE)
test_4$foo

the binding detaches during the clone somehow. Assuming the above language problem will not be changed, is it possible to fix the binding issue?

Also, should this be another issue?

wch commented 7 years ago

@epicfarmer Yes, please file another issue.