r-lib / R6

Encapsulated object-oriented programming for R
https://R6.r-lib.org

Returning invisible(self) in print method prevents the object from being gc'ed #140

Enchufa2 closed 6 years ago

Enchufa2 commented 6 years ago

I'm not sure about what's going on here. This works ok:

library(R6)

Person <- R6Class("Person",
  public = list(
    initialize = function() {
      self
    },
    print = function() {
      cat(paste0("Hello, world!\n"))
      # invisible(self)
    },
    finalize = function() {
      message("Finalizer has been called!")
    }
  )
)
Person$new()
gc() # still in .Last.value
gc() # Finalizer has been called!

The object is gc'ed on the second call, as expected: during the first call it is still reachable through .Last.value. But if the print method returns invisible(self), as #125 says and as print methods in R conventionally do, the object is not gc'ed:

library(R6)

Person <- R6Class("Person",
  public = list(
    initialize = function() {
      self
    },
    print = function() {
      cat(paste0("Hello, world!\n"))
      invisible(self)
    },
    finalize = function() {
      message("Finalizer has been called!")
    }
  )
)
Person$new()
gc() # still in .Last.value
gc() # nothing...

But if we print something else, then the object is gc'ed!

print(1)
gc() # Finalizer has been called!

wch commented 6 years ago

I suspect the root cause is that R is doing some sort of caching for printed objects. I'm not sure whether this is a bug in R, or if there's a good reason for it.

The same thing happens with this code:

print.foo <- function(x, ...) {
  cat("print.foo called\n")
  invisible(x)
}

new_foo <- function() {
  e <- new.env()
  reg.finalizer(e, function(e) message("Finalizer called"))
  class(e) <- "foo"
  e
}

new_foo()
gc() # still in .Last.value
gc() # nothing

If the last line of new_foo() is changed from e to invisible(e), then it does run the finalizer on the second gc().
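
Spelled out, the variant described above looks roughly like the sketch below. It only changes the last line of new_foo(). One thing worth noticing: with invisible(e), the top-level call is no longer auto-printed, so print.foo is never invoked. The comments about which gc() call fires the finalizer are left out on purpose, since that may depend on the R version.

```r
print.foo <- function(x, ...) {
  cat("print.foo called\n")
  invisible(x)
}

new_foo <- function() {
  e <- new.env()
  reg.finalizer(e, function(e) message("Finalizer called"))
  class(e) <- "foo"
  invisible(e) # changed from `e`: the result is returned invisibly
}

new_foo() # nothing is printed, so print.foo is not dispatched
gc()
gc()
```

withVisible(new_foo())$visible can be used to confirm that the value comes back invisibly.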

wch commented 6 years ago

It might actually be due to caching in R's S3 dispatch mechanism. For example, this executes the finalizer:

ident <- function(x) invisible(x)

new_foo <- function() {
  e <- new.env()
  reg.finalizer(e, function(e) message("Finalizer called"))
  class(e) <- "foo"
  e
}

ident(new_foo())
gc()
gc() # Finalizer called

But this doesn't:

ident <- function(x) UseMethod("ident")
ident.foo <- function(x) invisible(x)

new_foo <- function() {
  e <- new.env()
  reg.finalizer(e, function(e) message("Finalizer called"))
  class(e) <- "foo"
  e
}

ident(new_foo())
gc()
gc() # Nothing

After running the previous block of code, any one of these three snippets will trigger the finalizer:

# Calling `ident` generic on another object:
ident.default <- function(x) invisible(x)
ident(1); gc()

# Calling `print` on another object:
print(1); gc()

# Calling a completely different S3 generic on another object:
abc <- function(x) UseMethod("abc")
abc.default <- function(x) invisible(1)
abc(1); gc()

Calling a non-S3 function before gc() does not trigger the finalizer:

identity(1); gc()  # Nothing
sum(1); gc()       # Nothing
invisible(1); gc() # Nothing

Enchufa2 commented 6 years ago

You are right, nice analysis!

So, is this a bug or a feature? Do you think I should report this in R-devel?

Enchufa2 commented 6 years ago

BTW, closing this, as it has nothing to do with R6.

wch commented 6 years ago

It might be worth reporting to r-devel, or at least asking if it's expected behavior.

Enchufa2 commented 6 years ago

I was about to post on R-devel and...

Did you try your examples (and mine) in a plain R console? I think it's RStudio...

gaborcsardi commented 6 years ago

For me it fails in the terminal as well, but only if you run it in a fresh R session.

Enchufa2 commented 6 years ago

Ok, I'm trying to gather the final bits. It was R 3.2.3 (don't ask...) that succeeded. R 3.4.3 fails miserably with or without RStudio, so it's R after all. Does anyone have quick access to R 3.3.x?

Enchufa2 commented 6 years ago

Checked. It fails on R 3.3.2 too. Writing to R-devel...

Enchufa2 commented 6 years ago

For the sake of completeness, in case somebody lands here: Luke Tierney explains the issue here. It turns out that it has nothing to do with S3 dispatch. Instead, an internal register stays protected as a result of an explicit return. The following example shows the same issue without S3:

new_foo <- function() {
  e <- new.env()
  reg.finalizer(e, function(e) message("Finalizer called"))
  e
}

bar <- function(x) return(x)

bar(new_foo())
gc() # still in .Last.value
gc() # nothing

The internal register gets cleared as soon as there is another explicit return:

bar(1)
gc() # Finalizer called

Note that the important bit is the explicit return. If we remove it, the issue disappears:

new_foo <- function() {
  e <- new.env()
  reg.finalizer(e, function(e) message("Finalizer called"))
  e
}

bar <- function(x) x

bar(new_foo())
gc() # still in .Last.value
gc() # Finalizer called!
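
For anyone who wants to check these cases programmatically instead of watching for the message, here is a minimal sketch. Both new_tracked() and finalizer_ran_after() are hypothetical helpers made up for this illustration; they are not part of R or R6. The finalizer flips a flag in a separate environment, so the result can be inspected after gc(). The helper runs inside a function rather than at top level, so .Last.value is not involved, and what the explicit-return case reports may depend on the R version.

```r
# Hypothetical helper: creates an environment with a finalizer, plus a
# separate `state` environment whose flag flips when the finalizer runs.
new_tracked <- function() {
  state <- new.env()
  state$finalized <- FALSE
  e <- new.env()
  reg.finalizer(e, function(e) state$finalized <- TRUE)
  list(obj = e, state = state)
}

# Hypothetical helper: passes a fresh tracked object through `f`, drops the
# references held here, collects, and reports whether the finalizer ran.
finalizer_ran_after <- function(f) {
  tracked <- new_tracked()
  state <- tracked$state
  f(tracked$obj)
  rm(tracked)
  invisible(gc())
  state$finalized
}

finalizer_ran_after(function(x) x)         # implicit return
finalizer_ran_after(function(x) return(x)) # explicit return
```

Keeping the flag in its own environment matters: the tracked object itself must become unreachable, so the flag cannot live inside it.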