Dynamic inheritance. Fixes #12 and fixes #13

wch commented 10 years ago

This PR makes inheritance dynamic. Now when R6Class(inherit=AC) is called, it stores only the unevaluated expression AC, and not a copy of the AC object itself. The expression is evaluated at object instantiation time, instead of at class creation time.

This also allows for explicit namespacing of superclasses, with something like R6Class(inherit=pkgA::AC).

gaborcsardi commented 10 years ago

Cool! Will try it in a minute!

Two things I want to check:

How scoping works with inheritance. I.e. if you have an object x in pkgA, and you refer to it in pkgA::AC's constructor, and then inherit in pkgB::BC. What if you have an object x in pkgB, what if you don't? What if you also reference an x in BC's constructor? I don't think you can completely avoid surprises here, but it would be great to see (and document) how exactly this works, and of course also avoid the big surprises.
Write a test case with actual packages. I would just create two temporary packages and check that there are no surprises with cross-package inheritance. I know that you can kind of simulate this with the parent_env argument, but it is still not completely the same, and I think it would be nice to have such a test case in the package.

wch commented 10 years ago

If BC inherits AC's initialize, it will find x in the pkgA namespace. If BC has its own initialize, it will find x in the pkgB namespace. Finally, if that constructor calls super$initialize, that will find x in the pkgA namespace. And even if there's class XC in between AC and BC, it should continue to work this way -- that is, if XC doesn't have its own initialize method, BC's super$initialize should be AC's copy, and find pkgA::x. Of course I won't complain if you test this out. :)
Check out the devtools tests for examples of basic packages that are installed, tested, and then removed. It's probably not a good idea to have the unit tests for R6 actually install and remove packages, but maybe those tests can be added in a directory of extra stuff.

gaborcsardi commented 10 years ago

Wow. Very-very cool.

What's wrong with installing packages in tests cases? The R temporary directory should always be available, and one could just install them there. Anyway, this is secondary now, and I'll check the devtools tests.

wch commented 10 years ago

I was concerned that the packages might go in a real libpath, but maybe that's not an issue. Even if it is, it could be worked around by adding a temp directory to the libPath, and then having an on.exit() call that restores the old libPath. We have something like that in devtools (it doesn't use on.exit(), but it should): https://github.com/hadley/devtools/blob/master/tests/testthat/test-shim.r#L72

gaborcsardi commented 10 years ago

Here is the infrastructure for testing with real packages: https://github.com/gaborcsardi/R6/commit/af5d6d0186059eaed04fb95b1acce9396ab74999

It took a long time to write it, and there is a lot of room for mistakes because of some messy base functions like install.packages and package.skeleton. Nevertheless, it seems to work well now. I understand if you don't (yet?) want to include this in the standard unit tests, but I think it will still be useful. So I can move it to a file called notest-inheritance.R, and then it will not be run by test() and R CMD check.

I will add more tests later.

wch commented 10 years ago

Very nice - I like the idea of having functions that create the packages automatically, and also install and remove them. I do think I'd like to keep these separate from the normal tests, although maybe they could be put in tests/test-extra/.

I had another thought about cross-package inheritance. If inside of pkgB, an instance of BC is created and stored in a variable, then that instance of BC will have a copy of methods inherited from AC. Not only will it have the methods - those methods will also have their enclosing environment, which is a child of pkgA. So because this instance of BC is created and saved in pkgB when pkgB is built, R would end up saving most of pkgA's contents (captured in the pkgA environment enclosing the methods inherited from AC) in the built pkgB.

Additionally, say pkgB 1.0 is built with pkgA 1.0. Then the user later installs pkgA 2.0. Then the instance of BC that's inside of pkgB would still be inheriting pkgA::AC 1.0.

I think the only way around these issues is for the pkgB author to instantiate the BC object at load time, instead of at build time.

gaborcsardi commented 10 years ago

[...] I think the only way around these issues is for the pkgB author to instantiate the BC object at load time, instead of at build time.

Ah, indeed. Creating instances at load time seems like a good workaround for me. Btw. how does it work with the :: operator? I guess the first time you use it, it will load the package and .onLoad is called.

Btw.2. the environment of methods is set to public, right?

I think saving the current state of the instance is not so bad. If it is documented and explained, it will be OK. Also, I guess it will not be that frequent, that you (1) inherit from a class in another package, and (2) create an instance of it in the package environment. For these rare cases the workaround is OK, imho.

wch commented 10 years ago

Two things:

I realized that every package that uses R6 to create a class will face this issue, because the $new() function is a closure created in the R6 namespace.

I ran some tests on what happens when you create closures across packages. The short version: the body of the closure ($new in the case above) is static and saved in the package, and any other environments are also saved, except for the R6 namespace environment. That environment is connected dynamically. The tests and results are here: https://github.com/wch/pkgClosure

In short, the thing I thought was a problem isn't a problem. But there's another possible problem, which is that the R6 $new() function expects its namespace environment to be different than it actually is, because the object was created with one version of R6, and then loaded with a different version of R6.

Btw.2. the environment of methods is set to public, right?

For non-portable classes, that's correct. For portable classes, the methods run in a separate enclosing environment. That environment contains self, which is the public environment, which is also how the outside world sees the object.

gaborcsardi commented 10 years ago

Interesting. Package environments seem to be similar to .GlobaEnv, which definitely behaves differently than "normal" environments, when it comes to saving. E.g.

rm(list=ls())
a <- 10
ls()
# [1] "a"
ls(.GlobalEnv)
# [1] "a"
save(.GlobalEnv, file="/tmp/xx.rda")
rm(list=ls())
ne <- new.env()
load("/tmp/xx.rda", envir = ne)
ls(ne)
# character(0)
ls()
# [1] "ne"

Also, when you just do this:

f <- function() "foo"
environment(f)
# <environment: R_GlobalEnv>

the environment of f is .GlobalEnv, but that is (obviously) not saved.

I guess something similar happens with the package environment, but that is a bit more difficult to test.

gaborcsardi commented 10 years ago

Btw. I put you experiments here: https://github.com/gaborcsardi/R6/commit/a89b117815d10b538012908ff14cacbe8a9f23eb

It is easier to play with it this way, no need to restart R.

gaborcsardi commented 10 years ago

Btw. I was thinking about whether we could give a warning/error about the pkgB breakage. We would essentially need to "hash" the package environment and store it in funB, the instance in pkgB. Or rather in pkgB itself. Then pkgB would be able to detect a mismatch at load (?) time. No, at load time it might not notice. Also, since this is only a problem if you rely on the environment, maybe it is useless to give a warning/error in all cases.

gaborcsardi commented 10 years ago

Btw again, so this would be essentially the same kind of warning (NOTE, actually), that R CMD check gives about "no visible bindings for global variable". In fact R CMD check might report our cases as well.

wch commented 10 years ago

Another possibility is to ask R about the version of pkgA and then save that somewhere in the object. But this might be all too complicated - perhaps it's better to just tell people to be aware of the potential problem.

It's possible also to add a function to R6 which would be used like this in another package

foo <- NULL  # To avoid "no visible bindings for global variable" note
runOnLoad(foo <- MyClass$new())

Where the function is something like this:

runOnLoad <- function(expr, env = parent.frame()) {
  pkg <- # something that gets the package name from the env
  expr <- substitute(expr)

  setHook(
    packageEvent(pkg, "onLoad"),
    function(...) {
      eval(expr, env)
    }
  )
})

gaborcsardi commented 10 years ago

My experience is that many people do not read the docs. :) Especially not the advanced sections. But they might look at the examples more often, so I think having runOnLoad is particularly useful if it is in the docs example(s).

I would not save the package version in the object, that seems messy, and does not always help. E.g. with packages from github, where the version number might not change for a long time.

The warning might be also overkill. I am not even sure how it would be possible, what function would warn you at all?

So in summary, I think having runOnLoad with example usage in the docs would be very useful.

Btw. in the docs, you probably want to say that all this does not only apply to packages, but also to saving the objects with save. Maybe even better to start with save, because it is simpler to understand than cross package inheritance.

wch commented 10 years ago

Yeah, you're probably right that people won't read the docs. :)

I'm not 100% sure that it's possible to add load hooks this way, by running setHook at the top level of a package, because setHook won't be run each time that the (built) package is loaded.

Another option is to provide some sort of object where they can add expressions and call them all later, as in:

e <- exprList$new()

foo <- NULL
e$add(foo <- MyClass$new())

.onLoad <- function(pkg, lib) {
  e$evalAll()
}

The simplest alternative is to provide an example where it just goes into .onLoad directly:

foo <- NULL

.onLoad <- function(pkg, lib) {
  foo <- MyClass$new()
}

gaborcsardi commented 10 years ago

Maybe runOnLoad could be implemented by saving promises into the package, but it even hurts to write this down, so it is probably not a good idea.

I actually do like the last, simple solution. And the rule is simple: if you inherit from another package, and want an instance in your package, then create your instance in .onLoad. That's it.

wch commented 10 years ago

@gaborcsardi I put some manual tests in tests/manual/. If you want to do a PR with your tests, that would be helpful.

gaborcsardi commented 10 years ago

Sure, I can do it tomorrow. Still need to write some tests, though, because I only have the framework, and 2-3 tests only. Anyway, soon.

On Wed, Aug 13, 2014 at 10:31 PM, Winston Chang notifications@github.com wrote:

@gaborcsardi https://github.com/gaborcsardi I put some manual tests in tests/manual/. If you want to do a PR with your tests, that would be helpful.

— Reply to this email directly or view it on GitHub https://github.com/wch/R6/pull/18#issuecomment-52137093.

r-lib / R6

Dynamic inheritance. Fixes #12 and fixes #13 #18