r-lib / R6

Encapsulated object-oriented programming for R
https://R6.r-lib.org
Other
403 stars 56 forks source link

Unexpected modification of original object in its deep clone #213

Closed gaow closed 3 years ago

gaow commented 3 years ago

The code below takes hours to run on my computer (Ubuntu 20.04 5.4.0-39-generic):

library(R6)
Xobj <- R6Class("X",
  public = list(
    initialize = function(N,J,R) {
    private$X = array(1, dim = c(N,J,R,R))
    for(r in 1:(R-1)){
    for(t in (r+1):R){
        private$X[,,r,t] = 0
        private$X[,,t,r] = 0
    }
    }
    }
  ),
  private = list(
    X = NULL
  )
 )

st = proc.time()
N = 500
J = 500
R = 50
X = Xobj$new(N,J,R)
et = proc.time()
print(et-st)

However if I move the core code out of R6, it takes 12 seconds:

st = proc.time()
N = 500
J = 500
R = 50
X = array(1, dim = c(N,J,R,R))
for(r in 1:(R-1)){
  for(t in (r+1):R){
    X[,,r,t] = 0
    X[,,t,r] = 0
  }
}
et = proc.time()
print(et-st)

This "overhead" is very concerning. I'm starting to worry about other bits of my R6 code that might suffer from similar yet less noticeable performance issues, which hurts the performance of my software. @wch your insight is very much appreciated!

wch commented 3 years ago

Hi, you can make it go faster if you use portable=FALSE and then instead of private$x[y] <- z, you use x[y] <<- z for assignment

This is just as fast as your second example:

library(R6)
Xobj <- R6Class("X",
  portable = FALSE,
  public = list(
    initialize = function(N,J,R) {
      X <<- array(1, dim = c(N,J,R,R))
      for(r in 1:(R-1)){
        for(t in (r+1):R){
          X[,,r,t] <<- 0
          X[,,t,r] <<- 0
        }
      }
    }
  ),
  private = list(
    X = NULL
  )
)

st <- proc.time()
N <- 500
J <- 500
R <- 50
X <- Xobj$new(N,J,R)
et <- proc.time()
print(et-st)

This came up in another issue, and I explained why it happens here: https://github.com/r-lib/R6/issues/201#issuecomment-583486168

The documentation for R6 should be updated with information about this.

gaow commented 3 years ago

@wch thanks for the pointer. I have changed my code to non-portable and using <<- assignment for all private variables. However, there seems to be a major issue related to clone on inherited objects. For example the following code is problematic:

library(R6)
Yobj <- R6Class("Y",
  portable=FALSE,
  public = list(
    set = function(v) X<<-v
  ),
  private = list(
    X = NULL
  ),
  active = list(
    data = function() return(X)
  )
)

Xobj <- R6Class("X",
  portable=FALSE,
  inherit = Yobj
)

Then use Xobj in code,

> x=Xobj$new()
> x$set(2)
> x$data
[1] 2
> y = x$clone(T)
> y$data
[1] 2
> y$set(3)
> y$data
[1] 3
> x$data
[1] 3

as you can see the last line is expected to be x$data == 2 not x$data == 3.

If I change the code to not using <<- it seems fine:

library(R6)
Yobj <- R6Class("Y",
  portable=FALSE,
  public = list(
    set = function(v) private$X = v
  ),
  private = list(
    X = NULL
  ),
  active = list(
    data = function() return(private$X)
  )
)

Xobj <- R6Class("X",
  portable=FALSE,
  inherit = Yobj
)
> x=Xobj$new()
> x$set(2)
> x$data
[1] 2
> y = x$clone(T)
> y$data
[1] 2
> y$set(3)
> y$data
[1] 3
> x$data
[1] 2

This behavior is puzzling. I tried to fix it via a manual deep_clone without luck so far. But I think this should be handled on the developer's end. For now I'll have to stick to my slow implementation and @wch I look forward to your help.

wch commented 3 years ago

Hm, that appears to a be a bug. I'll look into it.

wch commented 3 years ago

@gaow I have a potential fix in #215. I'll need to add more tests before merging.

gaow commented 3 years ago

Thank you @wch it apparetly worked for my case -- two versions are now getting identical results. Are you planning on a new cran release after testing and merging ?

wch commented 3 years ago

Yes, I'll try to release soon. I want to add tests for more levels of inheritance.

wch commented 3 years ago

Closed by #215