Open Enchufa2 opened 7 years ago
Also note that it works correctly with a named function:
func <- function() 1
foo <- Foo__new() %>%
Foo__add(func)
rm(foo)
gc()
# Warning: foo destructor called
I'm unable to replicate this with R6 classes or reg.finalizer()
.
In this call:
library(magrittr)
foo <- Foo__new() %>%
Foo__add(function() 1)
rm(foo)
gc()
The parent environment of the function created through function() 1
appears to be not what one would expect -- e.g. I see
Browse[2]> environment(fun)
<environment: 0x110d74518>
rather than R_GlobalEnv
. Doing some debugging, that appears to be the same environment associated with function_list[[k]](value)
in the call stack.
Browse[2]> as.data.frame(cbind(sys.calls(), sys.frames()))
V1 V2
1 Foo__new() %>% Foo__add(function() 1) <environment: 0x103d504a8>
2 withVisible(eval(quote(`_fseq`(`_lhs`)), env, env)) <environment: 0x110d74ce0>
3 eval(quote(`_fseq`(`_lhs`)), env, env) <environment: 0x110d74f10>
4 eval(expr, envir, enclos) <environment: 0x103d50860>
5 `_fseq`(`_lhs`) <environment: 0x110d750d0>
6 freduce(value, `_function_list`) <environment: 0x110d742e8>
7 withVisible(function_list[[k]](value)) <environment: 0x110d74438>
8 function_list[[k]](value) <environment: 0x110d74518>
9 Foo__add(., function() 1) <environment: 0x110d746d8>
My guess is that this ultimately comes down to something going wrong with how magrittr
modifies the environment of functions it executes.
Moreover:
Browse[2]> ls(envir = environment(fun), all.names=TRUE)
[1] "."
Browse[2]> environment(fun)$.
<pointer: 0x3369e20>
Browse[2]> foo_
<pointer: 0x3369e20>
There it is the other copy of the pointer! That's why the destructor is not being called. Thanks, @kevinushey, for your help.
Is it the same issue with the update
branch? Has a simpler approach...
Just checked and yes, same issue.
Hmm it's not clear to me what's going on - one of the Rcpp experts may know... @romainfrancois ?
The gc
destroys an object when it loses all the references to it. In the first and third examples (no pipe or named function), the function's environment is the R_GlobalEnv
. Therefore, you remove foo
and the gc
destroys the object. In the second one (pipe + anonymous function), the function's environment is a new one created by magrittr
, and this environment stores another reference to the object. The function is stored inside the object. Thus, when you remove foo
, there is this other reference alive, and the object cannot be destroyed.
Seems to be an unhappy intersection of @Enchufa2 having code that would really like the object to be destroyed so that gc()
is called, and the non-standard approach to evaluating things requiring another copy preventing exactly this.
Sometimes one can fix the data type to add an explicit 'I am done, please unwind' member function to manually do some of the work of the destructor (at the cost of some local code complexity) -- but I am not sure one could do that here with Modules.
Seems to be an unhappy intersection of @Enchufa2 having code that would really like the object to be destroyed
Yes, basically, I have users' reports about machines running out of memory after many replications of a simulation model with simmer. I recommend them to reset the simulation environment, calling explicitly a function that deallocates resources, but I cannot rely on that.
Nevertheless, knowing what's the problem, I think I can implement in my package a workaround to transparently remove the reference that magrittr creates until this issue is solved.
Ok, I finally managed to find an example not involving Rcpp (sorry, guys, and thanks for your help). Consider the following not-so-useful function:
store <- function(env1, env2, func) {
env2$func <- func
env1
}
Then, as expected,
a <- new.env()
b <- new.env()
reg.finalizer(a, function(e) print("deleting 'a'"))
a <- store(a, b, function() 1)
rm(a); gc()
# deleting 'a'
rm(b); gc()
but
library(magrittr)
a <- new.env()
b <- new.env()
reg.finalizer(a, function(e) print("deleting 'a'"))
a <- a %>% store(b, function() 1)
rm(a); gc()
rm(b); gc()
# deleting 'a'
That's not ok. Finally, a workaround:
library(magrittr)
a <- new.env()
b <- new.env()
reg.finalizer(a, function(e) print("deleting 'a'"))
a <- a %>% store(b, function() 1)
rm(".", envir=environment(b$func)) # the trick
rm(a); gc()
# deleting 'a'
rm(b); gc()
That's ok again.
Sorry in advance because the minimal reprex is not that minimal: it involves
Rcpp
, and the issue I found is that some objects are not being garbage-collected because ofmagrittr
(I'm adding @eddelbuettel to the loop, just in case I'm mistaken and this has something to do withRcpp
instead). Let's consider the following mini-module:There is a class
Foo
that stores pointers toBar
, andBar
stores an R function, which is received throughFoo::add
. This code can be compiled withRcpp::sourceCpp(code='<code_here>')
. Then, let's play around:Foo
's destructor is correctly called and we see the warning. Perfect. But now,No warning! This is very serious, because the memory is not being deallocated. My guess is that
magrittr
could be messing with the environments and there is still a reference tofoo
somewhere, hidden, so that R is not removing the object. Is it possible?