r-lib / R6

Encapsulated object-oriented programming for R
https://R6.r-lib.org
Other
412 stars 57 forks source link

[Question] How to set active binding at end of a chain? #263

Closed d-sharpe closed 1 year ago

d-sharpe commented 2 years ago

I have a R6 class which is composed of other R6 classes. The parent class has active bindings which return each composing R6 object instance. However if the composing object has an active binding you get an error when trying to set the value of that binding when the active bindings are chained. The value gets updated correctly, but an error is raised. No errors raised if you extract the composing object to a variable first.

Is there a way to allow setting active binding values in chains without throwing this error??

See below for an example.

E.g.

ClassA <-
  R6::R6Class(
    classname = "classA",
    public = list(
      initialize = function(...) {
        private$.classBInstance <- ClassB$new()
      }
    ),
    private = list(.classBInstance = "ClassB"),
    active = list(
      classBObject = function(value) {
        if (missing(value)) {
          return(private$.classBInstance)
        } else {
          stop("Cannot set classB object")
        }
      }
    )
  )

ClassB <-
  R6::R6Class(
    classname = "classB",
    private = list(.value = 0),
    active = list(
      value = function(value) {
        if (missing(value)) {
          private$.value
        } else {
          private$.value <- value
        }
      }
    )
  )

class_a <-
  ClassA$new()

class_a$classBObject ## returns the classBInstance stored in classA

class_a$classBObject <- "new_value" ## throws the expected error detailed in classA active binding "Cannot set classB object"

class_a$classBObject$value ## 0

class_a$classBObject$value <- 1L ## throws an error (the classA active binding error) "Cannot set classB object"

class_a$classBObject$value ## 1L

subObject <- class_a$classBObject

subObject$value <- 2L ## works as expected

class_a$classBObject$value ## 2L
wch commented 2 years ago

The reason you're running into this issue is that subset assignment in R (with $<-, [<-, or [[<-) works differently than how one might expect, coming from other programming languages. Most R objects are immutable, so when you do something like this:

a <- list(x = 1)
a$y <- 2

It doesn't modify the original object that a referred to. Instead, it creates a new list with y in it, and then makes the binding a point to the new list. (Note: there are actually exceptions to this if you look deep under the hood, but I won't get into that here.)

The way this is implemented is:

`*tmp*` <- a
a <- `$<-`(`*tmp*`, "y", value=2)
rm(`*tmp*`)

In the second line, it invokes $<- like a regular function, with parentheses. When you call it this way (instead of inline, as in a$y <- 2, it simply returns the new list; it does not modify the old one. You can see this if you do:

a <- list(x = 1)
`*tmp*` <- a
`$<-`(`*tmp*`, "y", value=2)
a
#> $x
#> [1] 1

The stuff with *tmp* in the previous code block is to make a point to the new list.

(For more information about subset assignment, see https://cran.r-project.org/doc/manuals/r-devel/R-lang.html#Subset-assignment)

This is how R implements a$z <- 3. It doesn't distinguish between lists and environments (and by extension, R6 objects), when it translates the inline assignment into the three separate lines, with *tmp* being created and removed. With environments, the assignment to and from *tmp* is unnecessary and redundant, because the objects are mutable and are be modified in place -- but that's still how it works.

If a is an environment, It still assigns the value from the `$<-`(`*tmp*`, "y", value=2), even though that doesn't change anything. In the example below, I won't even assign e <- `*tmp*`, but you'l see that it is already modified even without that:

e <- as.environment(list(x = 1))
`*tmp*` <- e
`$<-`(`*tmp*`, "y", value=2)
as.list(e)
#> $y
#> [1] 2
#> 
#> $x
#> [1] 1

But again, R still does e <- `*tmp*` at the end, even for environments.


When you chain the subset assignments, like this:

b <- list(x = 1, y = list(z = 2))
b$y$z <- 3

It does something similar, doing assignments to both b$y, and then to b. So that's why you're running into this problem. When you do:

class_a$classBObject$value <- 1L 

It does an assignment like class_a$classBObject <- `*tmp*`, hence the error.


To work around this, you can modify the active binding so that it can do assignment, but only if the new object is identical to the old one. This is the same active binding that you wrote, but with an additional condition checking if identical(value, private$.classBInstance):

    active = list(
      classBObject = function(value) {
        if (missing(value)) {
          return(private$.classBInstance)
        } else if (identical(value, private$.classBInstance)) {
          private$.classBInstance <- value
        } else {
          stop("Cannot set classB object")
        }
      }
    )
d-sharpe commented 2 years ago

Thanks on two counts, first for the quick reply and second for the detailed explanation.

I see that I made an erroneous assumption about how mutable objects (environments) would be updated.

I have quite reasonably extensive nested and inheritance based R6 structure so I was trying to think of a more generic solution than updating every active binding. In your opinion, would creating an abstract base R6 class within my package and have all classes inherit from that, then defining a subset override function for that class e.g.

AbstractClass <-
  R6::R6Class(classname = "PackageAbstractClass",
              public = list(
                initialize = function(...) {
                  if (identical(class(self)[1], "PackageAbstractClass")) {
                    stop("Cannot create an abstract class")
                  }
                }
              ))

ClassA <-
  R6::R6Class(
    classname = "classA",
    inherit = AbstractClass,
    public = list(
      initialize = function(...) {
        super$initialize(...)

        private$.classBInstance <- ClassB$new()
      }
    ),
    private = list(.classBInstance = "ClassB"),
    active = list(
      classBObject = function(value) {
        if (missing(value)) {
          return(private$.classBInstance)
        } else {
          stop("Cannot set classB object")
        }
      }
    )
  )

ClassB <-
  R6::R6Class(
    classname = "classB",
    inherit = AbstractClass,
    public = list(
      initialize = function(...) {
        super$initialize(...)
      }
    ),
    private = list(.value = 0),
    active = list(
      value = function(value) {
        if (missing(value)) {
          private$.value
        } else {
          private$.value <- value
        }
      }
    )
  )

`$<-.PackageAbstractClass` <-
  `[<-.PackageAbstractClass` <- function(x, index, value) {
    if (!missing(value) && !identical(x[[index]], value)) {
      NextMethod()
    }
    return(x)
  }

class_a <-
  ClassA$new()

class_a$classBObject ## returns the classBInstance stored in classA

class_a$classBObject <-
  "new_value" ## throws the expected error detailed in classA active binding "Cannot set classB object"

class_a$classBObject$value ## 0

class_a$classBObject$value <- 1L ## no error, works as expected

class_a$classBObject$value ## 1L

subObject <- class_a$classBObject

subObject$value <- 2L ## works as expected

class_a$classBObject$value ## 2L