carpentries-incubator / docker-introduction

Reproducible Computational Environments using Containers
https://carpentries-incubator.github.io/docker-introduction/

reading /proc/version in Practice Makes Perfect exercise reports host OS details #163

Closed colinsauze closed 2 years ago

colinsauze commented 2 years ago

In the "Practice Makes Perfect" section of the Running Containers episode, learners are asked to find out what version of Linux the busybox image is running. The lesson tells them to look at the file /proc/version, but this file reports the same details as on the host OS, and on Windows it might mention WSL2. Was the intention here to find out information about the host rather than the container?

aturner-epcc commented 2 years ago

@colinsauze Thanks for spotting this. I suspect that what has happened here is that this has been well tested in the macOS Terminal and Windows PowerShell (neither of which has a /proc file system, as they are not Linux based), where it works. When run on Linux, you get the host /proc file system and you see the behaviour that you identified.

I think the simplest solution is to remove this task from the "Practice makes perfect" exercise and just have the task to find out what busybox does. @sstevens2 @jcohen02 What do you think?

sstevens2 commented 2 years ago

I'm a bit confused by this issue. Are you saying that on a Windows computer, when you cat /proc/version inside the busybox container, it prints back the host computer info instead of the container OS info?

jcohen02 commented 2 years ago

As @colinsauze has pointed out, the question in the "Practice Makes Perfect" section seems to be asking the learner to identify the "version" of Linux running in the busybox container, by which I assume it means the distribution rather than the kernel version.

If I understand correctly, the "content" of /proc/version is generated (or at least provided) by the kernel via the procfs when it is read. As such, I assume that this should always be calling into the kernel on the host system since containers use the host system kernel, and returning something specific to the host system? A quick test shows that this seems to be the case on Linux and I assume on other platforms, the value returned is Docker implementation-specific and depends on how the kernel abstraction is being presented to Linux containers running on other platforms?
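To illustrate the point about the shared kernel, here is a minimal Python sketch that pulls the kernel release out of /proc/version-style text. The helper name `kernel_release` and the `SAMPLE` string are hypothetical, made up for illustration rather than captured from a real system. On a Linux host, one would expect the same release to be reported inside and outside a container, since both reads are answered by the same kernel.

```python
import re
from pathlib import Path


def kernel_release(proc_version_text):
    """Return the kernel release from /proc/version-style text, or None."""
    # The text conventionally starts with "Linux version <release> ...".
    match = re.match(r"Linux version (\S+)", proc_version_text)
    return match.group(1) if match else None


# Illustrative sample only, not output captured from a real system.
SAMPLE = "Linux version 5.15.0-example (builder@example) #1 SMP"

if __name__ == "__main__":
    path = Path("/proc/version")
    # /proc only exists on Linux (or inside a Linux container or VM),
    # so fall back to the sample text elsewhere.
    text = path.read_text() if path.exists() else SAMPLE
    print(kernel_release(text))
```

Running this script on the host and again inside a container gives a quick way to compare the two values without reading the full version string by eye.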

I guess that the actual file that should be used to answer the question is /etc/issue or /etc/{os}-release, where {os} can be either "os" or the container OS's name, or perhaps some other value?! 🙂 Since the very cut-down setup in the busybox container doesn't contain either of these files, I'd agree with @aturner-epcc's suggestion that we either remove this part of the material altogether, or we could suggest the use of some other cut-down container with a minimal size that does provide an /etc/issue or /etc/os-release file.
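As a rough sketch of what reading such a file involves, the snippet below parses os-release style `KEY=value` lines into a dictionary. The function name `parse_os_release` and the `SAMPLE` text are assumptions for illustration, and the sample is not copied from any real image.

```python
import shlex


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # Values may be shell-quoted; shlex.split strips the quotes.
        parts = shlex.split(raw)
        fields[key] = parts[0] if parts else ""
    return fields


# Illustrative sample, not copied from a real image.
SAMPLE = 'NAME="Example Linux"\nVERSION_ID="1.0"\nID=example\n'
```

On Python 3.10 or later, the standard library's `platform.freedesktop_os_release()` reads the file directly and may be preferable to a hand-written parser.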

sstevens2 commented 2 years ago

Ahh! I didn't understand that what it was showing there was kernel related, not OS related. I think maybe we should edit the exercise to look for a different piece of information inside the container. If we remove it, I think we need to replace it with another exercise that has them practice running containers.

aturner-epcc commented 2 years ago

@sstevens2 My proposal will not remove the exercise totally, just the part of the exercise that fails. If there is a simple thing we can add that keeps it having two parts (to replace the removed part) then I am happy to add this but I could not think of anything obvious. Short term fix is to remove the incorrect part and use this issue to identify a better exercise in the future.

Are you happy with this proposed approach?

jcohen02 commented 2 years ago

I was taking a look at other minimal Linux containers to see if there's a suitable alternative for this example. I couldn't see anything that looks like a realistic option that's anywhere near as small as the alpine or busybox images, but actually the Ubuntu base images are not all that big: ubuntu:20.04 is only ~27MB to pull and then it expands to around 80MB, I think.

Maybe this is acceptable to ask learners to download? If they have a network connection that enables them to successfully participate remotely in a course then I'd assume that a download of this size should be feasible? If it's an in person course, I presume we can assume that it will be held in a location with acceptable internet connectivity.

If we were to go with this, we could change the use of /proc/version to /etc/os-release and then the exercise could perhaps ask learners to look at the operating system release details for Ubuntu 20.04?
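A minimal sketch of what that revised exercise might run, written in Python so the command can be inspected before it is executed. The helper names `docker_cat_command` and `read_file_in_container` are hypothetical, and the sketch assumes the `docker` CLI is installed and on the PATH.

```python
import shutil
import subprocess


def docker_cat_command(image, path):
    """Build the argument list for `docker run --rm IMAGE cat PATH`."""
    return ["docker", "run", "--rm", image, "cat", path]


def read_file_in_container(image, path):
    """Run the command and return its stdout, or None if docker is missing."""
    if shutil.which("docker") is None:
        return None
    result = subprocess.run(
        docker_cat_command(image, path),
        capture_output=True,
        text=True,
        check=True,  # raise if the container command fails
    )
    return result.stdout
```

For example, `read_file_in_container("ubuntu:20.04", "/etc/os-release")` would pull the image if needed and return the file's contents as a string.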

Happy if you want to have a think about this and, for now, temporarily remove the part of the exercise that isn't working as expected.

colinsauze commented 2 years ago

What is the aim of the exercise? Is it to show that Docker is using the kernel of the host OS (which I think gets a bit confusing on non-Linux systems, where you have some additional virtualisation between Docker and the host)? Or is it to find out about the operating system running in the container?

If it is the latter, then I like @jcohen02's suggestion of changing to /etc/os-release. But it is nice to keep to a minimal OS like alpine as it downloads much faster. If it's the former, then I think this is probably beyond what learners need to know at this level, and @sstevens2's suggestion of deleting the exercise is appropriate.

sstevens2 commented 2 years ago

@colinsauze the goal of the exercise is that they practice running containers. I don't think it matters what command they run in the container but it would be nice if it ran an easy (preferably familiar to most people) command in a container.

aturner-epcc commented 2 years ago

How about replacing the use of the alpine image with the python-slim image for this exercise? It is a bit larger (44 MB compared to 2.5 MB for alpine), but I do not think it is large enough to cause issues with image download. This would give us much more scope for allowing people to run commands to explore the image that are actually closer to what they would do anyway (e.g. what version of Python is installed, which packages are available in Python).

sstevens2 commented 2 years ago

@aturner-epcc I went to give this a try and noticed slim is a tagged version of python. I'm hesitant to use that for this exercise now because we haven't explained tagged images yet. I also checked the sizes on my computer: python:slim is 123 MB and ubuntu:latest (as of this post) is 72.8 MB. I'm going to go ahead and make a PR to the lesson using ubuntu and we can see if it gives us any issues when teaching.

aturner-epcc commented 2 years ago

Sounds good to me