ColinFay / r-ci

Docker images for Continuous Integration / Continuous Delivery for R Projects

isolate dev helpers from package dependencies #3

Open maxheld83 opened 5 years ago

maxheld83 commented 5 years ago

This is a great project; I've been looking for Docker images suitable for R CI/CD recently and have built my own (based off of r-hub) for GitHub Actions.

In ghactions, I try to isolate the package development helpers (things like pkgbuild etc.) from the dependencies of the package under test, to avoid clashes and unexpected interactions between the two (say, package foo requiring a different version of pkgbuild than the one in the container).

I do this by always installing development helpers to some location that is not in .libPaths() by default, and then only appending it to .libPaths() as needed, or even just calling requireNamespace() against that other directory. Details are here: https://r-lib.github.io/ghactions/articles/isolation.html
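A minimal base-R sketch of that idea, for illustration only. The directory name is a hypothetical stand-in (a temporary folder instead of a path baked into an image), and no packages are actually installed into it:

```r
# Hypothetical helper library; in a real image this would be a fixed path
# that dev helpers were installed into at build time.
helper_lib <- file.path(tempdir(), "devhelp-library")
dir.create(helper_lib, showWarnings = FALSE)
helper_lib <- normalizePath(helper_lib, winslash = "/")

# By default the helper library is not on the library search path.
default_paths <- .libPaths()
isolated <- !(helper_lib %in% default_paths)

# Append it only when the helpers are needed. Appending (rather than
# prepending) keeps the package's own dependencies first in the lookup order.
.libPaths(c(default_paths, helper_lib))
appended_paths <- .libPaths()

# Restore the original search path afterwards.
.libPaths(default_paths)
```

Note that .libPaths() silently drops directories that do not exist, which is why the sketch creates the folder first.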

I've found that helpful to better reason about what is happening.

Does that seem like a good/interesting idea for you for r-ci as well?

Obviously, feel free to just close this if it doesn't seem relevant/right to you.

ColinFay commented 5 years ago

Ah yeah that definitely makes sense to work that way.

I haven't thought about the possibility of the dev package conflicting with the test package dependencies themselves.

So basically the idea is that if I have a package that depends on a particular version of {devtools}, running the tests in the Docker container might conflict with the {devtools} version installed in the image 🤔

So, for example in gitlab, we would like to do:

r-3-5:
  stage: test
  image: colinfay/r-ci-tidyverse:3.5.0
  script:
    - R -e 'remotes::install_local()'
    - R -e 'devtools::check()'

and have the devtools:: here be a different one from the one that is possibly installed by remotes::install_local().

Do you think it's possible to make that happen in the Dockerfile itself, or should it be left to the end user?

maxheld83 commented 5 years ago

Let me just describe what I'm roughly doing in ghactions right now and see what you think of that. (More details here https://r-lib.github.io/ghactions/articles/isolation.html).

I bake all development helpers into some directory in the Docker image which is not in .libPaths() by default, such as /usr/lib/R/devhelp-library (can be anything). This path would also be baked into the image as an environment variable at build time (say, R_LIBS_DEV_HELPER).

By default, say a docker run -it foo/img:latest Rscript -e 'devtools::check()' would then actually fail and find no devtools. The user (= developer) could override this in two ways:

  1. pass env=R_LIBS="$R_LIBS_DEV_HELPER", which would prepend the action helpers to .libPaths(). This would be relatively unsafe, but get you quick access to all the helpers. We might also default to this, and let users unset this env variable, but that might defeat the purpose of isolating-by-default.
  2. call some requireNamespace(lib.loc = Sys.getenv("R_LIBS_DEV_HELPER")) before calling devtools. The container could include some helper functions with syntactic sugar to make this easier and safer to do (kind of like withr). This would offer some isolation between calls (vertically down the screen, so to speak) but not within the call tree (behind the screen, metaphorically speaking). I'm exploring how this might be done in https://github.com/r-lib/ghactions/issues/212 (not right now though).

Any packages that the CI/CD script (~ the user) installs would just go to the default pkg location and be readily available there, though the helper library would be prepended to the package search path when env=R_LIBS="$R_LIBS_DEV_HELPER" is set.
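To make the effect of option 1 concrete, here is a rough sketch that starts a fresh R process with and without R_LIBS set and compares the resulting search paths. The helper directory is again a hypothetical temporary stand-in, and child_lib_paths() is a made-up name for this illustration, not something the image ships:

```r
# Hypothetical stand-in for the helper library baked into the image.
helper_lib <- file.path(tempdir(), "devhelp-library")
dir.create(helper_lib, showWarnings = FALSE)

# Return .libPaths() as seen by a fresh R process, optionally with R_LIBS set.
child_lib_paths <- function(r_libs = NULL) {
  if (!is.null(r_libs)) {
    old <- Sys.getenv("R_LIBS", unset = NA)
    Sys.setenv(R_LIBS = r_libs)
    # Put the parent's environment back once the child has finished.
    on.exit(if (is.na(old)) Sys.unsetenv("R_LIBS") else Sys.setenv(R_LIBS = old))
  }
  rscript <- file.path(R.home("bin"), "Rscript")
  system2(rscript,
          c("--vanilla", "-e", shQuote("writeLines(.libPaths())")),
          stdout = TRUE)
}

default_paths <- child_lib_paths()            # helper library absent
helper_paths  <- child_lib_paths(helper_lib)  # helper library comes first
```

Because the R_LIBS entries come first, a helper version would shadow any version of the same package that the CI script installs later, which is the "relatively unsafe" part of option 1.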

This is currently working pretty alright inside of ghactions, but I want to spend some more time exploring it / letting it marinate before I recommend this to others. I might be missing something and/or this might be too unwieldy.

So in your example this might look like so:

r-3-5:
  stage: test
  image: colinfay/r-ci-tidyverse:3.5.0
  script:
    # load each helper from the isolated library before calling it
    - R -e "requireNamespace('remotes', lib.loc = Sys.getenv('R_LIBS_DEV_HELPER')); remotes::install_local()"
    - R -e "requireNamespace('devtools', lib.loc = Sys.getenv('R_LIBS_DEV_HELPER')); devtools::check()"

This way, we'd never have anything lying around in the search tree that wasn't explicitly mentioned in the DESCRIPTION by the pkg developer, and we'd be sure that the CI script always uses the "vanilla" versions of package dev helpers it (~the image) ships with and has been tested with. If users absolutely need to use a non-vanilla (say github) version for the development helper pkgs (that should be pretty rare), they can always adapt the script to these needs, but it wouldn't be the default.

And if they install a bunch of things which happen to depend on non-vanilla development helper pkgs, then that wouldn't affect the script actions.

By the same token, people could/should then stop (abusing?) Suggests: for development helpers, because pkgs such as pkgdown don't enhance the pkg for pkg users at all.

I actually need to think some more about whether this makes sense for other CI/CD services. It makes some sense in ghactions, but that's got a bit of a different paradigm than Travis or GitLab CI AFAIK.

What are your thoughts? Does this make sense?

maxheld83 commented 5 years ago

I now have this running and a bit documented at https://github.com/maxheld83/r-ci

This doesn't offer the different versions of R etc that you had (I don't need that for now), but it does give a sense of the isolation I was shooting for.

It's also driving https://github.com/r-lib/ghactions

Let me know what you think.

maxheld83 commented 5 years ago

So, turns out I built myself a nice self-own here: relying on *Namespace() is a bad idea for several reasons listed in https://github.com/r-lib/ghactions/issues/272.

It's better to just change .libPaths() using withr::with_libpaths(), or even just set R_LIBS= (safest of them all).
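A base-R sketch of what such a scoped change could look like, in the spirit of withr. The function name with_dev_helpers() is hypothetical, and the demo uses a temporary directory rather than a real helper library:

```r
# Hypothetical helper: evaluate `code` with `lib` prepended to the library
# search path, then restore the previous path even if `code` fails.
with_dev_helpers <- function(code, lib = Sys.getenv("R_LIBS_DEV_HELPER")) {
  stopifnot(dir.exists(lib))  # .libPaths() would silently drop a missing dir
  old <- .libPaths()
  on.exit(.libPaths(old), add = TRUE)
  .libPaths(c(lib, old))
  force(code)  # `code` is lazy, so it only runs once the path is changed
}

# Demo with a stand-in directory (assumption: no packages installed in it).
helper_lib <- file.path(tempdir(), "devhelp-library")
dir.create(helper_lib, showWarnings = FALSE)

before <- .libPaths()
inside <- with_dev_helpers(.libPaths(), lib = helper_lib)
after  <- .libPaths()
```

In a CI script this might be used as with_dev_helpers(devtools::check()), assuming R_LIBS_DEV_HELPER is set in the image. As with the R_LIBS approach, the isolation only holds between calls, not inside the call tree.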

There are still some benefits to installing dev helper packages on their own path as I have done, but the advantages and possible isolation are fairly limited unless/until we can think of something much more fundamental (sketched here https://github.com/r-lib/ghactions/issues/212)