r-lib / R6

Encapsulated object-oriented programming for R
https://R6.r-lib.org
Other
403 stars 56 forks source link

R6 should raise an error for unexistent member values, not return NULL #210

Closed stefanoborini closed 3 years ago

stefanoborini commented 3 years ago

The following code should raise an error

> x <- R6::R6Class("x", list(foo = function() { print(self$notexistent) }))
> xx <- x$new()
> xx$foo()
NULL

This behavior is as misguided as fortran IMPLICIT directives. It does not catch typos or no longer existing variables. Instead it silently keeps going with a NULL value until it might break many, many lines of code where the issue actually is.

Note that using get() is not an option. It's a burden, and defeats the purpose of the OO approach.

wch commented 3 years ago

This behavior is as misguided as fortran IMPLICIT directives.

There are better ways to make your case. What this says is that you are willing to rudely demand something without considering the implementation or the trade-offs, and without considering the possibility that someone else has thought about these things.

The reason R6 behaves this way is because R6 objects are environments, and that's how environments behave:

e <- new.env()
e$x
#> NULL

In order to do the check that you want, we would need to add a $.R6 method, but that adds significant overhead when doing any operation with an R6 object.

Here's a benchmark of how long it takes to access a field using $:

library(R6)
library(microbenchmark)

A <- R6Class("A", public = list(x=1))
a <- A$new()

microbenchmark(a$x, unit = "ns")
#> Unit: nanoseconds
#>  expr min  lq   mean median    uq  max neval
#>   a$x 723 734 827.04    737 774.5 6942   100

If we add a $.R6 method which does what you want, it adds a lot of overhead:

`$.R6` <- function(x, name) {
  if (!exists(name, x, inherits = FALSE)) {
    stop(name, " does not exist")
  }

  .subset2(x, name)
}

microbenchmark(a$x, unit = "ns")
#> Unit: nanoseconds
#>  expr  min   lq     mean median   uq      max neval
#>   a$x 2482 2528 188969.8 2588.5 2673 18620208   100

And that's why R6 doesn't do it. If you want to define a $ method for your own class, you're free to do that. For debugging purposes, you could define $.R6 like above.

I actually agree that it would be better if R6 threw an error when trying to access members that don't exist. But I don't think it's worth the performance cost to do it by default.

The same issue was discussed here: #34

stefanoborini commented 3 years ago

@wch There's a point where performance should take a step back in favor of delivering reliable code. This is one of those times. At least define an alternative accessor so that those who don't want to engage in this broken semantic can use it.

Or define a flag in the R6Class function so that one can choose which behavior to use for the environment, assuming it's possible to do so.