r-lib / R6

Encapsulated object-oriented programming for R
https://R6.r-lib.org
Other
410 stars 56 forks source link

R6-encapsulated Shiny observer prevents the encapsulating object from being finalize()-ed #232

Open mmuurr opened 3 years ago

mmuurr commented 3 years ago

I'm not sure if this is an Issue (capital I), but I also can't find a good pattern here for when I embed a Shiny observer inside an R6 object. Take these two classes:

WithoutObs <- R6::R6Class("WithoutObs",
  private = list(
    finalize = function() {
      print(sprintf("%s: running finalize()", class(self)[1]))
    }
  ),
  public = list(
    initialize = function() {
      ## NOP
    }
  )
)

WithObs <- R6::R6Class("WithObs",
  private = list(
    obs = NULL,
    finalize = function() {
      print(sprintf("%s: running finalize()", class(self)[1]))
      private$obs$destroy()
    }
  ),
  public = list(
    initialize = function(rx) {
      private$obs <- shiny::observe(print(rx()))
    },
    destroy_obs = function() {
      private$obs$destroy()
    }
  )
)

Now when creating a few objects, then dropping our references to those objects, I'd expect the finalizers of both to run. Note that the finalizer for WithObs destroys the observer.

rxval <- shiny::reactiveVal(0)
without_obs <- WithoutObs$new()
with_obs <- WithObs$new(rxval)

rm(without_obs, with_obs)
gc()
# [1] "WithoutObs: running finalize()"
#          used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
# Ncells 513748 27.5    1138677 60.9         NA   658348 35.2
# Vcells 941482  7.2    8388608 64.0      16384  1804906 13.8

with_obs's finalizer doesn't run. If we now flush Shiny and then garbage-collect now we get:

shiny:::flushReact()
# [1] 0  ## observer runs because the finalizer didn't run and thus didn't destroy()
gc()
#            used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
# Ncells  577265 30.9    1138677 60.9         NA   713492 38.2
# Vcells 1082047  8.3    8388608 64.0      16384  1804907 13.8

... still no finalizer.

In a brand new session, if we explicitly destroy the embedded observer, we can get the finalizer to run, but it requires a reactive flush, almost suggesting that the encapsulating object's finalizer is blocked until all other (external to the object) references to its fields have also been released?

rxval <- shiny::reactiveVal(0)
without_obs <- WithoutObs$new()
with_obs <- WithObs$new(rxval)
with_obs$destroy_obs()  ## destroy the observer ahead of time

rm(without_obs, with_obs)
gc()
# [1] "WithoutObs: running finalize()"
#          used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
# Ncells 513755 27.5    1138697 60.9         NA   658348 35.2
# Vcells 941486  7.2    8388608 64.0      16384  1804921 13.8

shiny:::flushReact()
gc()
# [1] "WithObs: running finalize()"
#          used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
# Ncells 516129 27.6    1138697 60.9         NA   658348 35.2
# Vcells 944092  7.3    8388608 64.0      16384  1804921 13.8

I'm surprised by both (i) the non-execution of with_obs's finalizer on the first gc() above and (ii) the need to both explicitly destroy the encapsulated observer and flush the reactive system to finally get the encapsulating object to finalize().

FWIW, my use case that led me down this path is pretty much exactly what you see with the WithObs class: an encapsulated observer that I want to destroy() when the program loses its reference to the encapsulating object -- specifically to try to prevent memory leaks by not accumulating unused/unneeded observers during a program lifecycle.

EDIT: forgot to mention, this is with R6 v2.5.0.

wch commented 3 years ago

I believe the problem has to do with the creation of the observer:

    initialize = function(rx) {
      private$obs <- shiny::observe(print(rx()))
    },

When observe(print(rx())) is called, under the hood it creates a function like this:

function() {
  print(rx())
}

The enclosing environment of that function is the environment from which observe() is called from. This is needed for the user's code inside observer to be able to access objects from that environment. In this case, the enclosing environment is the "inside" of the initialize() method. So the observer holds a reference to the created function, which contains a reference to the "inside" of the initialize() method, which contains a reference to the R6 object.

So as long as the observer exists, it will keep the R6 object alive.

That much probably isn't surprising. What may be surprising is that the "destroyed" observer exists until flushReact() is called. For example:

library(shiny)
o <- observe(message("this is the observer"))
reg.finalizer(o, function(e) message("observer finalized"))
o$destroy()
rm(o)
invisible(gc())

shiny:::flushReact()
invisible(gc())
#>  observer finalized

I'd have to think about this some more before I'd say it's a bug. There might be a good reason why it's like that. Can you file an issue on Shiny about this?

mmuurr commented 3 years ago

@wch Yeah, happy to create an issue in the Shiny repo simply to document this behavior, at least.

I did eventually come to that realization that the observer's newly-created function (via shiny::exprToFunction()) is what was holding the reference to the encapsulating R6 object, but also couldn't see a nice pattern for how to resolve this. In the end (just to overcome a hurdle for the moment), I simply implemented a free() method on the R6 object that calls destroy() with a mental reminder to call free() whenever I'm done with such an object -- harkening to C/C++'s *alloc/free pattern, despite this being pretty antithetical to a garbage-collecting runtime :-)

If one forgets to call free() prior to the encapsulating object losing any named reference, it seems quite tricky to avoid a 'memory leak' of sorts where both the observers and the R6 object continue to accumulate in a long-running program due to a circular reference path.

In any case, thanks for taking a look, and if I can be of any help in working through this (including designing test cases), feel free to ping me.

wch commented 3 years ago

One workaround is to create the observer so that its code is evaluated in a different environment. Then the resulting function won't capture the execution environment of initialize(). For example, this uses the global environment, and the gc() does collect the WithObs object.

WithObs <- R6::R6Class("WithObs",
  private = list(
    obs = NULL,
    finalize = function() {
      message("WithObs: running finalize()")
      private$obs$destroy()
    }
  ),
  public = list(
    initialize = function(rx) {
      private$obs <- shiny::observe({
          print(rx())
        },
        env = globalenv()
      )
    }
  )
)

rxval <- shiny::reactiveVal(0)
with_obs <- WithObs$new(rxval)

rm(with_obs)
invisible(gc())
#> WithObs: running finalize()

Note that this change does not cause the private$obs object to be GC'd. (You can test this with a finalizer on that observer, but be careful to make sure that the finalizer function itself doesn't capture some environment that keeps the object alive.)

Also note that with future versions of Shiny, the observe() function's env and quoted arguments will be deprecated. The way forward will be to use quosures, like this:

    initialize = function(rx) {
      myquo <- rlang::new_quosure(
        quote({ print(rx()) }),
        env = globalenv()
      )
      private$obs <- rlang::inject(shiny::observe(!!myquo))
    }