Porting the autograder image to alpine linux

wlnirvana commented 4 years ago

When getting suggested buffer size withsysconf(_SC_GETPW_R_SIZE_MAX) on some OSes (e.g. Alpine linux with musl, FreeBSD, etc.), -1 is returned, causing autodriver to exit. This PR sets the buffer size to 1024 based on the POSIX doc

wlnirvana commented 4 years ago

Note that the modifications made so far are not enough for basing the autograding docker image on alpine linux, primarily because busybox does not satisfy the setuid requirement of the autodriver binary.

autodriver is designed to run as user autolab instead of root, but relies on setuid being set for itself to gain root privileges. This enables the binary to use utilities like mv, chown, etc. to manipulate data not belonging to user autolab.

Unfortunately, alpine's core utility busybox will automatically drop privileges for mv, causing this line to fail. I tried to bypass the issue with a manually compiled qucik-and-dirty replacement for mv. It worked, but autodriver failed at chown then.

I haven't checked all the commands used by autodriver, but I doubt they will work. Meanwhile, manually replacing all the binaries just seems too much effort and error-prone.

fanpu commented 4 years ago

Thanks for spotting this issue, and the incredibly detailed comments! The changes for autodriver.c looks good to me, and right now the main issue is with the setuid for the autodriver when you are running on Alpine. I understand this PR is still a WIP, so I will ignore the changes to Makefile first. I will be following along, since getting autograding woring on Alpine is a huge performance boost and something that I haven't been able to get successfully working in the past before (due to the fact that Autolab was still on Rails 4 back then and there were dependencies that were too old and didn't exist in libraries available for Alpine) and so it's something that I'm definitely very excited about if it's actually possible.

wlnirvana commented 4 years ago

I've made a prototype base image that works with the hello demo lab in my local environment. Changes committed mainly are:

Reordered the autodriver compiler flags, in particular LDFLAGS. This flag is empty on Ubuntu but that's fine because argp is built-in on Ubuntu. However, alpine needs a separate argp-standalone package to be installed and explicitly linked, so the linking flag order matters. To maintain compatibility with Ubuntu, the flag is only set in the dockerfile.
Removed sudo from the makefile, because alpine does not have sudo installed by default. And more importantly, it is not the Makefile's responsibility (except in full-fledged make install command) to manage the permission settings, but can be done in dockerfile as with the case of current ubuntu docker file.
Minor updates in autodriver.c to port it to alpine linux.
A sample alpine based dockerfile with gcc and libc-dev undeleted, so that it can run against the hello lab. The setuid issue is worked around by installing coreutils and findutils to replace the busybox symlinks.

Also note that instead of cloning the entire Tango repo, I used wget to download the updated autodriver source code FROM MY PERSONAL FORK of tango.

fanpu commented 4 years ago

Our CMU sysadmin in charge of managing our deployments mentioned that a way to get around the Busybox dropping privileges issue is to set the uid to 0 inside autodriver.c, either at the start of the program, or alternatively inside call_program after fork is performed and before execv. As of now the removal of the setuid bit on autodriver inside the Makefile is the most contentious part of this PR as I'm not sure if it may break any existing forms of deployment, so if the workaround can be adopted that would be much safer

wlnirvana commented 4 years ago

Manually setting uid to 0 worked in my previous experiments (a PoC one, not directly on autodriver.c), but I was concerned that code changes in autodriver.c would break the system or expose security vulnerabilities (note that this program will run make and user submitted code) on other testing backends like EC2, ssh, etc. That's why I opted for a more conservative approach.

I don't really get the Makefile point, though. For dockerised grading images,autodriver binary is copied over like in the following dockerfile, where permissions must be set again on the copy destination rather than source:

https://github.com/autolab/Tango/blob/e56a6c6497d4ab71e62f6c6dad347fa404dc2476/vmms/Dockerfile#L23-L24

This effectively invalidates any permission manipulation in makefile.

I'm not sure about other deployments like EC2, local machines, etc. But as long as the autodriver executable is copied, it should not break the system.

fanpu commented 4 years ago

I agree with you that the setuid in the Makefile is pointless for Docker deployments. Apparently the setuid thing was something that was implemented a long time ago and might have been to have something to do with deployments on (Tashi VMs)[https://www.cs.cmu.edu/~jclopez/ref/tashi-acdc09.pdf], because they might have thought that SSH-ing into Tashi VMs and executing autodriver as a non-root user with setuid is safer than allowing root login, but to be honest no one on the current team really knows why it is there. I know this isn't entirely helpful in resolving the issue but I hope it might shed some light on the situation.

wlnirvana commented 4 years ago

That makes sense. Any update on setting uid to 0 in autodriver.c v.s. installing coreutils and findutils?

wlnirvana commented 4 years ago

Closing this PR as I am synchronizing my master branch with Tango upstream master. Changes committed are checked into feature/porting-autograding-to-alpine branch.

autolab / Tango

Porting the autograder image to alpine linux #167