CI with sarus requires docker-in-docker style containers

I'm migrating some projects to use sarus as part of their CI pipelines, which so far works well.

However, some projects have tests set up to generate and run commands of the form srun [options] sarus [image] [test_command]. Unfortunately this cannot work when the entrypoint of CI is a container, as slurm and sarus are not available inside the container itself. The current workarounds are either (1) to extract the generated srun commands from the container and execute them in a normal shell outside of it, or (2) to collect the srun commands outside of CI and have a trusted person review and update them by hand. But option (1) defeats the purpose of containers, and option (2) is not dynamic enough when new tests are added frequently, especially when pull requests are open and multiple versions of the software are in use at the same time.
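To make workaround (1) concrete, here is a minimal sketch of what the host-side part could look like, assuming the tests inside the container write their generated commands to a text file, one per line. The function names are hypothetical, and the explicit `sarus run` subcommand is my assumption about how the abbreviated `sarus [image] [test_command]` form above expands.

```python
import shlex
import subprocess


def build_srun_command(image, test_command, srun_options=()):
    """Build the argv list for one generated test command.

    Assumption: the container is started with `sarus run <image> <command>`.
    """
    return ["srun", *srun_options, "sarus", "run", image,
            *shlex.split(test_command)]


def parse_command_file(text):
    """Turn a file of generated commands (one per line) into argv lists.

    Blank lines and lines starting with '#' are skipped. Anything that is
    not an srun command is rejected, since these lines will be executed
    outside of the container.
    """
    commands = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        argv = shlex.split(line)
        if argv[0] != "srun":
            raise ValueError(f"refusing to run non-srun command: {line!r}")
        commands.append(argv)
    return commands


def run_on_host(commands, runner=subprocess.run):
    """Execute each command outside the container, stopping on failure."""
    for argv in commands:
        runner(argv, check=True)
```

Even in this small form the drawback is visible: the commands have to leave the container and be executed with the privileges of the host shell, which is exactly what a sarus-in-sarus entrypoint would avoid.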

In many CI environments it is common practice to let the user specify the container in which to start execution, which can in fact be a docker-in-docker container, so that the user can run docker run ... inside of the container without having to escape it.

I think a similar solution with sarus + slurm would be incredibly useful as well: we could make the entrypoint of CI a sarus-in-sarus container, where the user could allocate resources, download images, and run them through sarus.

What I would like to know is whether this would be easy to manage, and whether this comes with performance or security penalties.

