carpentries / workbench

Repository for Discussions and Materials about The Carpentries Workbench
https://carpentries.github.io/workbench/
Creative Commons Attribution 4.0 International

create docker container #39

Open zkamvar opened 1 year ago

zkamvar commented 1 year ago

There have been calls for a Docker container for The Workbench, so that folks can use something similar to the make docker command.

The docker container should be relatively straightforward to implement and we can implement two flavours:

  1. slim with just R, pandoc, and the necessary packages with room to expand for other packages. This would be used if people want to render without previews.
  2. fully featured with RStudio, R, and pandoc so that folks can work to preview their lessons.

Both of these can be based on containers from the Rocker Project or the base image for the R-Universe project where we can set the MY_UNIVERSE environment variable.

zkamvar commented 1 year ago

I've added a docker/ folder that contains a Dockerfile and a README. Both could be improved.

aw1231 commented 1 year ago

I am not a Docker expert by any means, but the new WASM beta (https://docs.docker.com/desktop/wasm/) makes me think we might leverage it to speed up install/setup time. As I understand it, the WASM shim runs without having to worry about architecture, which is a problem I've encountered with Docker in the past. Unfortunately, using it drops compatibility with older Docker containers. Thoughts?

zkamvar commented 1 year ago

This looks neat! Alas, I don't think we can leverage WASM yet because support for R in WASM is currently limited to https://github.com/georgestagg/webR#demo, and while it is possible to install and use R packages in the WASM, the process is extremely difficult at the moment (https://github.com/georgestagg/webR/issues/11). Moreover, there is no support for pandoc in WASM that we can use.

aw1231 commented 1 year ago

Ah, true. Well, hopefully as it comes out of beta, things can change. I'd be happy to help with implementing the container. From above, essentially make two containers, one that contains R, pandoc, etc. and another fully featured one with RStudio, yes?

zkamvar commented 1 year ago

Building the Container

Sort of. I think building the container is fairly straightforward (at least it will be far less complex than https://github.com/carpentries/lesson-docker because we will be adding packages on top of an existing system instead of trying to shoehorn the jekyll build system into the RStudio container or vice-versa).

Right now the bare-bones Docker setup is https://github.com/carpentries/workbench/blob/d134ed1cb85a09b4a5e95d97597350a86503cec2/docker/Dockerfile, which uses the R-Universe container as a base (though preview is not possible).

I believe an equivalent one with RStudio included would be to replace r-universe/base with rocker/rstudio from the rocker project (or rocker/geospatial for the r-geospatial lessons): https://rocker-project.org/images/versioned/rstudio.html

User interaction with the container

How to mount volumes, expose ports for previewing, and get the container to write to the host system as a non-root user is another matter. It is likely solved with docker-compose, but that's currently beyond my technical capabilities (see https://rocker-project.org/images/versioned/rstudio.html#how-to-use for an overview of the array of possibilities).

One big caveat is how to handle the packages for lessons that use R-markdown. Since {renv} uses a global package cache with symbolic links, all of those symbolic links will be broken inside of the container. Thus, while we mount the lesson itself inside of the container, we need to ignore the renv/profiles/lesson-requirements/renv/ directory (which I think is addressed by https://stackoverflow.com/questions/29181032/add-a-volume-to-docker-but-exclude-a-sub-folder, but it requires either a very long user command or a docker-compose file).
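The "very long user command" variant of that exclusion could look roughly like the sketch below: a second, anonymous volume is declared on the renv library path so that it shadows that part of the bind mount. The image name (workbench-lesson) and mount point (/lesson) are hypothetical placeholders, not names that exist in this repository.

```shell
# Sketch only: IMAGE and MOUNT are hypothetical placeholders.
IMAGE="workbench-lesson"
MOUNT="/lesson"
RENV_LIB="renv/profiles/lesson-requirements/renv"

# Set DOCKER=echo to print the arguments instead of starting a container.
DOCKER="${DOCKER:-docker}"

# $1 is the lesson directory on the host.
run_lesson() {
  # First -v: bind-mount the lesson into the container.
  # Second -v (container path only): anonymous volume that hides the
  # host's renv library, with its broken symlinks, inside the container.
  "$DOCKER" run --rm -v "$1:$MOUNT" -v "$MOUNT/$RENV_LIB" "$IMAGE"
}
```

Calling run_lesson "$PWD" from the lesson directory would start the container; with DOCKER=echo it only prints the arguments, which is a cheap way to check the mounts first.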

harivyasi commented 1 year ago

Hi! I am looking into it and would be happy to help :)

aw1231 commented 1 year ago

Hi @harivyasi, thanks! I've had a busy past few weeks, but next week I'll likely sit down and start on the docker-compose setup. Let me know how you get along, and I'll be happy to add to anything you are working on.

alee commented 11 months ago

I have a naive docker + docker-compose setup at https://github.com/alee/python-novice-gapminder/commit/0de50c735f4e32f9da953eb3cc278de6e96ff0b9 that should work for lessons that don't have the R-markdown issue raised earlier. Supporting a container-local {renv} should be pretty straightforward, though: add another bind mount for renv:./renv/profiles/lesson-requirements/renv/ or something similar to the docker-compose.yml file. What's a good R-markdown lesson to test that on, though?

Not sure I understand why RStudio support would be needed; is it for the R-markdown lessons?

The image name in docker-compose.yml should also be templated for each specific lesson.

trhallam commented 10 months ago

Following on from @alee, I've got a working Docker image and compose setup for a Workbench lesson. The base image is rocker/rstudio:4.3.1. This image takes care of many permission issues by mapping the user who ran the docker compose up command to any mounts from the Docker image itself.

A hook could be added to the Rprofile to call sandpaper::serve() after the project is opened if necessary.

A couple of issues

Link to lesson

zkamvar commented 8 months ago

Note: in sandpaper 0.15.0, you can now use the SANDPAPER_SITE environment variable to move the site path to a different directory to avoid permissions conflicts (see https://carpentries.github.io/sandpaper/news/index.html#sandpaper-0150-2023-11-29)
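In a container, one possible use is to point SANDPAPER_SITE at a directory outside the bind-mounted lesson. A minimal sketch, assuming a hypothetical image name (workbench-lesson), a hypothetical mount point (/lesson), a hypothetical site path (/tmp/site), and an image that ships sandpaper 0.15.0 or later:

```shell
# Sketch only: IMAGE, MOUNT, and SITE_DIR are hypothetical placeholders.
IMAGE="workbench-lesson"
MOUNT="/lesson"
SITE_DIR="/tmp/site"   # assumed to be writable inside the container

# Set DOCKER=echo to print the arguments instead of starting a container.
DOCKER="${DOCKER:-docker}"

# $1 is the lesson directory on the host.
run_with_site_dir() {
  # -e sets SANDPAPER_SITE inside the container; the intent is that the
  # built site goes to SITE_DIR instead of the bind-mounted lesson folder.
  "$DOCKER" run --rm -e "SANDPAPER_SITE=$SITE_DIR" -v "$1:$MOUNT" "$IMAGE"
}
```

Whether a given image picks the variable up depends on how its entrypoint calls sandpaper, so this is worth checking against the image actually in use.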

fherreazcue commented 8 months ago

We were not aware of this discussion taking place, but we built a Docker image that is working and tested on both Linux and Mac. We went for a minimal R approach, to try to keep the size of the container as small as possible. The Dockerfile is available in the repo.

You can get the latest versions with:

For Linux:

docker pull ghcr.io/uomresearchit/sandpaper:latest

For Mac:

docker pull ghcr.io/uomresearchit/sandpaper:latest_arm
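Picking the tag could also be scripted. The sketch below assumes that latest_arm is meant for ARM64 hosts (such as Apple Silicon Macs) and latest for everything else; the comment above only distinguishes "linux" and "mac", so that mapping is a guess based on the tag name. The function names are made up for illustration.

```shell
# Map a machine architecture (as reported by `uname -m`) to an image tag.
# Assumption: `latest_arm` targets ARM64 hosts, `latest` everything else.
sandpaper_tag() {
  case "$1" in
    arm64|aarch64) echo "latest_arm" ;;
    *)             echo "latest" ;;
  esac
}

# Full image reference for the machine this script runs on.
sandpaper_image() {
  echo "ghcr.io/uomresearchit/sandpaper:$(sandpaper_tag "$(uname -m)")"
}
```

With these defined, docker pull "$(sandpaper_image)" would fetch the matching image.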

It can be run (from the lesson's base directory) with:

docker run -p 4321:4321 -v "$PWD":/siteroot/ ghcr.io/uomresearchit/sandpaper:latest

Note: make sure you have the necessary directories before running the container:

mkdir -p instructors/{data,fig,files} learners/{data,fig,files} profiles/{data,fig,files}
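The brace expansion in that mkdir command is a bash/zsh feature; a plain POSIX sh (for example dash) would create directories literally named {data,fig,files}. A portable sketch of the same step, with a made-up function name:

```shell
# Create the instructors/, learners/, and profiles/ subdirectories
# (data, fig, files) under the lesson directory given as $1.
make_lesson_dirs() {
  for top in instructors learners profiles; do
    for sub in data fig files; do
      mkdir -p "$1/$top/$sub" || return 1
    done
  done
}
```

Running make_lesson_dirs . from the lesson's base directory has the same effect as the one-liner above.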

Because it is on a bind mount, you can edit the lesson files and see the site updated a few seconds later (at http://localhost:4321/ ).

alee commented 4 months ago

Thanks @fherreazcue - I just tried this out and it is working great! I'm using it with a minimalist docker-compose.yml e.g.,

services:
  server:
    image: ghcr.io/uomresearchit/sandpaper:latest  # switch to `latest_arm` for macos
    volumes:
      - ./:/siteroot
    ports:
      - "127.0.0.1:4321:4321"

and it works like a charm and is much smaller than the 5 GB R images that were being generated earlier :sweat_smile: