Closed gaborcsardi closed 7 years ago
Btw. this PR also help accessing private members from the finalizer, which is quite hacky currently, as I can see you need to extract the private env manually: a$.__enclos_env__$private
PTAL.
I just had a thought: What about inheritance? Say that class A has a finalizer, and class B inherits from A. What should happen then?
YEah, that's a good one.
I think the "usual" thing is sensible: the finalizer is still called, and you can override it in the child. It might be working like this already, although even then it is worth adding a test case.
If A and B both have finalizers, I can imagine cases where B's finalizer should override A's, but I can also imagine some where both finalizers should run. In the latter case, could the finalizer invoke super$finalizer()
?
And some tests of this would be nice :) (including tests with two levels of inheritance)
If A and B both have finalizers, I can imagine cases where B's finalizer should override A's, but I can also imagine some where both finalizers should run. In the latter case, could the finalizer invoke super$finalizer()?
Yes, I suspect that it already works like this, finalize
is just a method like any other, after all.
And some tests of this would be nice :) (including tests with two levels of inheritance)
I'll write them in a minute.
@wch Btw. what did you have in mind for two levels? There are a lot of combinations to test with two levels....
I've been thinking about it some more, and I think I've convinced myself that B's finalizer should have to explicitly call super$finalize()
if it wants to invoke A's finalizer.
Here's an example of that illustrates the other side: why it can be awkward for B to have to invoke A's finalizer explicitly. Suppose A is a class that represents a database connection with a finalize
method that closes the connection. And B is a subclass that buffers the reads (or something like that). That means that B (and every other subclass of A) will need to have a finalize
that calls super$finalize()
, even if B's finalizer does nothing else. And it's also possible to create a subclass of A which has broken behavior -- if the subclass has no finalize
, it will leave the connection open.
On the other hand, if B's initializer wants to invoke A's initializer, the it needs to call super$initialize()
. This is the current behavior. I think it makes sense for finalizers to have behavior that mirrors this.
For two levels: suppose C inherits from B, which inherits from A. Then if C's finalize
calls super$finalize()
, it should invoke B's finalize
. And if B's finalize
calls super$finalize()
, it should invoke A's finalize
.
I don't think it makes sense for C's finalize
to be able to call A's finalize
directly. Did you have any other combinations in mind?
TBH, I am not sure what you prefer now. :) Sorry.
Right now
super$finalize
.initialize
.I think this is all quite good. It is safe and flexible. In your DB example, B does not have to define a finalizer just to call A's finalizer. OTH if it defines one to do sg extra, it can still call the parent's finalizer via super
.
I'm almost done with the test cases, 5 more minutes....
For two levels: suppose C inherits from B, which inherits from A. Then if C's finalize calls super$finalize(), it should invoke B's finalize. And if B's finalize calls super$finalize(), it should invoke A's finalize.
Got it, so all three have finalizers and we destroy an object from the grandchild.
Oh right, I was confused -- I forgot that B would just inherit A's finalizer if it didn't define one. I agree that what you outlined sounds good, and I'm glad it just works that way!
Never mind, it was good to think this through. And also, to write the test cases, which are done btw.
As for two levels of inheritance, you could test the case when B does not define a finalizer, only A and C, etc. so there are a lot of cases.
Hmmm, this reminds me, what if we call super$finalize()
and the superclass does not have a finalize
method? I'll check this out in a minute.
Hmmm, this reminds me, what if we call super$finalize() and the superclass does not have a finalize method? I'll check this out in a minute.
OK, this fails, as expected. On the other hand it also fails for initialize
, so maybe we want to fix them both at the same place?
I think it is good defensive programming to just call super$initialize
and super$finalize
, especially if another package (and person) provides the superclass, so ideally these should always succeed. Although
if (!is.null(super$finalize)) super$finalize()
is a simple workaround, so probably not a big deal.
Wait a sec, I am fixing some test labels.
OK, should be all good now.
Maybe we could generate the non-portable test cases from the portable ones, because it is easy to make copy-and-paste errors. :)
Looks good!
Just a quick comment sharing my experiences with "automagic" finalizers:
Make sure to test that the finalizers are re-entrant. There's a risk that finalizers depend on code that may not be available when R runs it. For instance, an object may still be around, but all package code may have been unloaded when it's finalizer runs. This becomes especially complicated when they run when R itself exits, cf. from help(".Last")
:
Exactly what happens at termination of an R session depends on the platform and GUI interface in use. A typical sequence is to run .Last() and .Last.sys() (unless runLast is false), to save the workspace if requested (and in most cases also to save the session history: see savehistory), then run any finalizers (see reg.finalizer) that have been set to be run on exit, close all open graphics devices, remove the session temporary directory and print any remaining warnings (e.g., from .Last() and device closure).
Then there's the following comment from help("reg.finalizer")
:
R's interpreter is not re-entrant and the finalizer could be run in the middle of a computation. So there are many functions which it is potentially unsafe to call from f: one example which caused trouble is options. As from R 3.0.3 finalizers are scheduled at garbage collection but only run at a relatively safe time thereafter.
Problems often shows up as non-reproducible sporadic errors that are quite hard to troubleshoot, because how and when finalizers are run is hard to predict.
I went through the above a while ago with R.oo - required lots of trial-and-error development and very defensive coding.
Just my $.02
@HenrikBengtsson Thanks, this is very useful! Maybe it could be even in the docs.
@HenrikBengtsson Reading now the R.oo code, wow!
Don't look to close; that piece of code is ad hoc and patchy - it's an instance where "if it works - don't change it" truly applies.
@HenrikBengtsson What I am thinking about is that R6 objects are pretty self-contained, and usually do not need anything from the R6 package.
What might happen is that an R6 class needs its (package) environment to work. E.g. my process
class needs the processx
package environment. If this environment is not around after gc, that indeed restricts what we can run in the finalizer.
I'll experiment with this.
@HenrikBengtsson Very interesting, thanks for sharing your experiences. These do sound like really tricky problems to diagnose.
FWIW, when I run this code in an R session:
library(R6)
print(loadedNamespaces())
e <- new.env()
reg.finalizer(e, onexit = TRUE, function(e) {
cat("==== Finalizer ====\n")
print(loadedNamespaces())
})
.Last <- function() {
cat("==== .Last ====\n")
print(loadedNamespaces())
}
q()
It doesn't unload any of the packages before .Last()
or e
's finalizer are called. (Tested on R 3.2.3 on a Mac.)
$ R
> library(R6)
> print(loadedNamespaces())
[1] "R6" "graphics" "utils" "grDevices" "stats" "datasets"
[7] "methods" "base"
>
> e <- new.env()
> reg.finalizer(e, onexit = TRUE, function(e) {
+ cat("==== Finalizer ====\n")
+ print(loadedNamespaces())
+ })
NULL
>
> .Last <- function() {
+ cat("==== .Last ====\n")
+ print(loadedNamespaces())
+ }
> q()
==== .Last ====
[1] "R6" "graphics" "utils" "grDevices" "stats" "datasets"
[7] "methods" "base"
==== Finalizer ====
[1] "R6" "graphics" "utils" "grDevices" "stats" "datasets"
[7] "methods" "base"
Classes can define a public finalize method, and this will be called automatically when objects are garbage collected.
The destructor is called with no arguments, but of course it can refer to
self
andprivate
, and if the class is not portable, then it can also just refer to members.