Open bkmgit opened 2 years ago
Hi @bkmgit, potentially, yes, this could fit into HPC Carpentry. However, so far, I've only taught this Singularity material as part of a 2-day workshop covering Docker and Singularity (Reproducible computational environments using containers). I don't know if others may have taught this material alongside other modules but I suspect probably not, so far - this material is reasonably new and is still evolving.
One challenge with the current material in the context of HPC Carpentry is that for building Singularity images, we use Docker. This works great alongside the Docker lesson because it allows the learners to make practical use of the Docker knowledge they've learnt the previous day. For HPC Carpentry, I guess Docker would not be on the syllabus and so this will complicate things since Docker knowledge would need to be a prerequisite for the current material.
The reason we use Docker is that giving learners access to Singularity on a system where they have admin/root access is a challenge: they need a platform where they can install Singularity, which really limits things to Linux at present. Many attendees will be using Mac/Windows systems. With Docker we can run a container from the Docker Singularity image and use this to build Singularity images.
I suppose an alternative might be to provide access to pre-provisioned cloud resources with Singularity installed, perhaps providing one node per student where they can have admin access?
Just a quick update to my previous comment to say that I've seen your related issue in the docker-introduction repository - certainly if you were looking at covering cloud-based HPC workloads and therefore wanted to cover Docker too, that would provide a good basis for also using this material.
I don't have experience of teaching HPC Carpentry myself but maybe one of the other maintainers for this or the Docker lesson can offer some comments/advice on this.
Thanks for your response. One of the aims is to have a set of related lessons that can be offered in a workshop. Typically not all lessons are used in a single workshop since workshops are tailored for their audience.
Have you also looked at Podman? It has been evaluated directly for HPC as well (https://vhpc.org/static/PapersPresentations2020/iscworkshops2020_paper_12.pdf), but using it just to make containers is probably also fine. Docker on Windows seems to require WSL (https://docs.docker.com/desktop/windows/install/), so this is probably also OK as a requirement for Podman (https://podman.io/getting-started/installation.html).
@bkmgit We have not looked at Podman as the HPC systems we have been using for the courses so far only have Singularity available. I am not sure how different Podman is from Singularity - I guess the level of difference (both conceptually and in practice) would be what decides if this is simply a different track (e.g. using snippets like on HPC Intro) or whether a separate lesson would be the right choice.
Also, Singularity seems to be much more common than Podman on HPC systems in my experience so far, though this is obviously influenced by the systems I have had access to, which have generally been in the UK. Maybe the prevalence of Podman vs Singularity is different in different countries.
My understanding is that people learn to use Docker first, and then learn to use Singularity. Could they possibly have the choice of learning Podman first and then Singularity? Even though Podman does not require root privileges, HPC integration is better for Singularity than for Podman.
I think the problem with this approach is that Docker is the most generally useful container tool so it is the best option to teach people first. I would not be keen to replace teaching Docker with teaching Podman for this reason.
To follow up on @aturner-epcc's comments, while there are pros and cons to both Docker and Podman, a major benefit of Docker in the context of both the docker-introduction module and this singularity module is the maturity and cross-platform nature of the Docker tooling.
Attendees of these courses tend to be using a wide range of systems covering Windows, Mac and Linux. For end-user interaction with Singularity where administrative access to the host system is not required, we generally provide access to a suitable HPC platform (with Singularity installed) when running the course. However, elements that require the learners to have administrative access, e.g. building Singularity images, present a problem since Singularity only supports Linux at present.
Being able to run a Singularity container within Docker that provides a Singularity environment where the learners have administrative access has proved very useful to support this. The ease of installing Docker across Windows, Mac and Linux is the key aspect that makes Docker the most suitable tool to enable this.
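For anyone unfamiliar with the approach, here is a rough sketch of the kind of command involved, written as a small Python helper that only assembles the `docker run` invocation and does not execute it. The image name and tag, the mount point, the file names and the helper name `singularity_build_command` are all assumptions for illustration, as is the premise that the image's entrypoint is the `singularity` binary; check the lesson material for the exact values it uses.

```python
import shlex
from pathlib import Path

# Assumption: an image that ships Singularity with `singularity` as its
# entrypoint. The name and tag are illustrative, not taken from this thread.
SINGULARITY_IMAGE = "quay.io/singularity/singularity:v3.5.3"

# Assumption: where the host directory is mounted inside the container.
MOUNT_POINT = "/home/singularity"


def singularity_build_command(workdir, definition="my_image.def",
                              output="my_image.sif",
                              image=SINGULARITY_IMAGE):
    """Return the argv list for building a Singularity image inside Docker.

    The host directory `workdir` (containing the definition file) is
    bind-mounted into the container so that the resulting .sif file is
    written back to the host.
    """
    host_dir = Path(workdir).resolve()
    return [
        "docker", "run",
        "--rm",            # remove the container once the build finishes
        "--privileged",    # the build needs elevated privileges in the container
        "-v", f"{host_dir}:{MOUNT_POINT}",
        image,
        "build",
        f"{MOUNT_POINT}/{output}",
        f"{MOUNT_POINT}/{definition}",
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected first.
    print(shlex.join(singularity_build_command(".")))
```

Because the learner is root inside the container, the build step works the same way on Windows, Mac and Linux hosts, which is the point made above.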
I'm assuming we are likely to see more container platforms appearing over time and I do think we should keep in mind the possibility of covering/supporting different platforms, or at least highlighting their existence within the lesson material. However, in the context of this lesson, with Podman being Linux-only at present, this would add another layer of complexity (either requiring the running of Podman inside Docker for non-Linux users, or requiring the provision of remote Linux nodes to course participants). Hence I agree that sticking with Docker as the underlying container environment used to support the Singularity training material is currently the best option.
Might this lesson be something that could be used in an HPC Carpentry workshop?