opensafely-core / research-template-docker

This provides the devcontainer Docker image used by repos created from the OpenSAFELY research template.
Other
0 stars 0 forks source link

Add test of dev container #13

Closed StevenMaude closed 2 months ago

StevenMaude commented 3 months ago

This PR partly deals with opensafely-core/codespaces-initiative#18 by:

StevenMaude commented 3 months ago

This needs:

Jongmassey commented 2 months ago

check the actual R packages (depends on https://github.com/opensafely-core/research-template-docker/pull/17, I think)

I've backed out of this for a number of reasons (see that PR). Happy to help discuss alternative approaches next week if needs be.

Jongmassey commented 2 months ago

Re: the R packages. The test I'd conceived was broadly "are all the R packages installed as expected?". In an ideal world, the expected == the packages from the renv in the R action image.

However, as seen in #17 it's not quite that simple. There are base packages and those that ship with the Rocker image, neither of which are in the renv in the R action image. In a similarly ideal world, we'd be able to cleanly remove the latter, but that's a bit more complicated.

We would not expect the base package list to change unless we bump the R version, nor would be particularly expect those that ship with Rocker to change. Therefore, defining the these extra packages from these two sources in the test as the expected "extra packages", means that if either of these do change then the test will fail, which I think is a signal we want.

This could probably be simplified by just looking at package counts, something a bit like

opensafely_r_package_count=$(curl -s https://raw.githubusercontent.com/opensafely-core/r-docker/main/renv.lock | jq  '.Packages | length')
r_installed_package_count=$(Rscript -e 'installed_packages <- as.data.frame(installed.packages()[,c(1,3:4)])' \
    -e 'package_names< unique(installed_packages[installed_packages$Priority!="base"|is.na(installed_packages$Priority),,drop=FALSE][["Package"]])'
    -e 'cat(length(package_names))')
# this is the count of packages in R base + those shipped in the rocker base image
extra_package_count=17
expected_package_count=$(($opensafely_r_package_count + $extra_package_count))
if [[ $r_installed_package_count -ne $expected_package_count ]]
   then
      # ... throw an error

17 might not be right, it's just the number I had in mind

Jongmassey commented 2 months ago

Done a bit more investigation and there's WAY more simplification to be done.

a) use packages.csv as the comparison rather than the renv config file. This is a better source for "what packages are in the OpenSAFELY R Action image" as each of these packages is requested to be installed during the image build. Where these packages are already installed (i.e. in r base or recommended) they are not added to the renv config.

curl -s https://raw.githubusercontent.com/opensafely-core/r-docker/main/packages.csv | cut -d, -f1 | sed 's/\"//g'

b) don't do any filtering of the R package names. If we use the more comprehensive list in a) we don't need to exclude various package sources (aka "Priorities).

cat(unique(as.data.frame(installed.packages()[,c(1)])[,1]),sep="\n")

This leaves only 3 discrepancies: docopt, littler, and spatial. The first two are added in the Rocker dockerfile here and here (alas not in a readily parseable way). Spatial is a mystery as it's a recommended package:

> find.package("spatial")
[1] "/usr/local/lib/R/library/spatial"
> installed_packages[installed_packages$Package=="spatial",]
        Package Version    Priority
spatial spatial  7.3-13 recommended

I think it's reasonable to add these three to a "known differences" list in the test. If the test fails because either of the upstreams have changed the package list (i.e. Rocker or OpenSAFELY R action) then I think that's a good thing because those are both things we really want to be aware of.

StevenMaude commented 2 months ago

For what it's worth, it doesn't seem entirely unreasonable to install spatial in the R Docker image, just for the consistency of having all of the CRAN r-recommended packages in there.

Although if no-one's asked for it so far, it seems equally reasonable to leave it as an exception here.

Jongmassey commented 2 months ago

For what it's worth, it doesn't seem entirely unreasonable to install spatial in the R Docker image

I agree

StevenMaude commented 2 months ago

Think all that's left:

StevenMaude commented 2 months ago

It's not possible to run the scheduled workflow from this PR. So I checked it in a fork instead by triggering manually. (The commit ID doesn't match to this repository, but the workflow's the same.)

A screenshot showing the GitHub Actions run for the devcontainer check passing, and the workflow file contents.

StevenMaude commented 2 months ago

It's probably possible to simplify the commit history, but I've left the sausage making on display here to show the process I went through.

Fine to do that tidying, if it's warranted.

Jongmassey commented 2 months ago

decide when the R Studio tests should be run (at image build, or on scheduled run of a research-template dev container; maybe in the dev container is fine?)

Feels like one of the most likely things to cause these tests to fail is changes to the R image. The weekly build is intended to pick up changes to this image. Therefore, testing it when this is done feels sensible