reactorlabs / rir

GNU General Public License v2.0
62 stars 18 forks source link

Extra pool in Code objects prevents S4 objects from collection #1288

Open fikovnik opened 1 month ago

fikovnik commented 1 month ago

Consider the following code (simplified version of the R methods package test):

A <- setRefClass("A")
A$methods(f = function(i) { print(i) })

bug <- TRUE

# class B inherits from A
B <- setRefClass("B", contains = "A")
B$methods(
  # destructor that sets the global flag `bug` to FALSE
  finalize = function() { bug <<- FALSE },

  # method that calls `f` from the superclass
  g = function() {
    # R at its best - first pull out the method `f` from super class into the current environment
    usingMethods("f")
    # call it using lapply - the match.fun in lapply will resolve the string to a closure
    lapply(1, "f")
})

# new instance of class B
b <- B()

# call B$g which in turn calls A$f printing 1
b$g()

# removes b
rm(b)

# since b is not referenced from anywhere, it should be collected
# and have its finalizer called
gc()

# check that the finalized has run
stopifnot(!bug)

In vanilla R it works well. On master it works well as long as the number of type feedback slots for the observed callees is less than 13.

Here is what happens:

The way the observed callees recording works is that first the closure gets stored in the extra pool of the corresponding caller code object, i.e. the 'lapply' function, and then the index of this pool entry gets stored in the type feedback. Normally, we keep just 3 callees per call site. Since lapply is used for a whole lot of things before we exhaust the available slots way before we call the b$g(). However, when we increase the number of slots, the bug manifests. It even manifests when PIR is disabled (PIR_ENABLE=off)

This happened when @stepam38 added the support for the contextualized type feedback.

IMHO: It is a fairly weird test: relying on a call to gc() to collect an object is wrong. Having finalized together with gc also wrong.