Closed by jrolli 2 years ago
This is similar to something I've been thinking about, and I'm curious what you think. It would be a significant change in the way challenges are written, but there would be some advantages:
Currently there are the built-in challenge types (node, pybuild, flask, etc.), and when these are built, their corresponding internal Dockerfile is used, e.g. https://github.com/ArmyCyberInstitute/cmgr/blob/master/cmgr/dockerfiles/flask.Dockerfile for a Flask challenge. This has the nice property that challenge authors don't have to think about the underlying Dockerfile and basically works like the old challenge classes from Hacksport.
But I think there are potentially some reproducibility issues with this, especially in environments like the picoGym where we keep challenges running for a long time and might occasionally need to switch servers and rebuild them, etc. For example, `flask` challenge builds created before May 11 would use `flask==1.1.4`, while the same builds recreated today would use `flask==2.0.1`. Over time, I imagine the base images will also need to change as Ubuntu versions become deprecated, etc.
One way around this could be to:
1) Pin all base images/dependencies in the included Dockerfiles, by sha / specific version / etc.
2) Make creating a challenge work more like `create-react-app <name>` or `django-admin startproject <name>`, where it would copy a whole `embed.FS` with starter files (like the current examples in the repo) for a given challenge type to disk (a rough sketch of this follows the example below), something like:
```
$ cmgr start flask my-flask-challenge
# Created template directory ./my-flask-challenge
$ ls my-flask-challenge
# app.py challenge.md Dockerfile README.md solver/ templates/
```
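To make the scaffolding idea concrete, here is a minimal sketch of what a `cmgr start` subcommand could look like, assuming the existing example challenges were embedded under a hypothetical `templates/` directory with `go:embed`; none of these names or paths exist in cmgr today.

```go
// Minimal sketch of a hypothetical `cmgr start <challenge type> <directory>`.
// Assumes the example challenges are embedded under templates/<type>/.
package main

import (
	"embed"
	"fmt"
	"io/fs"
	"os"
	"path"
	"path/filepath"
	"strings"
)

//go:embed templates
var templates embed.FS

// scaffold copies the starter files for challengeType into destDir.
func scaffold(challengeType, destDir string) error {
	root := path.Join("templates", challengeType)
	return fs.WalkDir(templates, root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		target := filepath.Join(destDir, filepath.FromSlash(strings.TrimPrefix(p, root)))
		if d.IsDir() {
			return os.MkdirAll(target, 0o755)
		}
		data, err := templates.ReadFile(p)
		if err != nil {
			return err
		}
		return os.WriteFile(target, data, 0o644)
	})
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: cmgr start <challenge type> <directory>")
		os.Exit(1)
	}
	if err := scaffold(os.Args[1], os.Args[2]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

The key property is that the Dockerfile is copied out of the binary once, at creation time, so the challenge keeps exactly the template it was scaffolded with even as newer cmgr releases ship updated templates.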
The advantage of this would be that, since the Dockerfile is part of the challenge rather than baked into the binary, the underlying Dockerfile would remain constant no matter when the challenge was rebuilt, and the template Dockerfiles included in the `cmgr` binary could be smoothly upgraded over time without breaking older challenges, since they would be copied fresh in their current state whenever a new challenge was started.
Also, since what is currently in the `examples/` directory of the repo would essentially be immediately visible when someone starts working on a new challenge, it might be even easier for authors than flipping back and forth to the documentation, since in many cases they could just tweak the example challenge.
This is just a thought, though - I'd be open to collaborating on something like this if you would be interested, but I also recognize that a major change like this is probably an iffy proposition.
I'll admit I'm kind of torn on this. I think the transparent upgrading is generally what we want (it lets developers use new features and patches vulnerabilities that are likely unintended), but reproducibility is definitely an issue. However, I think the only way to get reproducibility is to save/cache the final image/artifact, since nothing guarantees the repos hit during the first build will continue to serve the content you downloaded (pip is better about this, but the apt repos are definitely a problem).
Can you help me understand (out-of-band, if necessary) what your flow for "switch servers and rebuild them" would look like? In particular, I'm curious if adding support to store images in a private repository or adding some kind of replication process would be more robust since they would allow preserving the container as is across rebuilds.
Thoughts?
Wanting to keep the transparent upgrade behavior is fair (and it's true that using apt makes complete reproducibility difficult if not impossible, though I'd guess apt packages usually behave pretty well within a given Ubuntu version).
As an example of when this might arise as an issue, say that we're running challenge servers (hosts running cmgrd) A and B, but eventually for whatever reason need to decommission them and redeploy all of their challenges on new server C instead. It would be ideal if the same challenge directory inputs originally deployed on A and B were guaranteed to produce the same build image outputs on C without worrying about potential changes to the challenges (granted, having robust solve scripts would help mitigate this, but that's historically been... difficult to get authors to provide).
As a more concrete example, we currently have a bunch of old Hacksport challenges built for Ubuntu 16.04, but adding them to the Gym now would mean redeploying them against 18.04 (or 20.04, or whatever), so we'd have to manually test and potentially modify them, especially if they depend on specific glibc behavior, etc. And this problem will continue to arise as underlying dependencies change over time, whether it's Hacksport's built-in challenge classes or cmgr's built-in Dockerfiles.
I do see the appeal of being able to patch bugs in the base challenge types without manually editing each challenge, like we did here – although even then, with both Hacksport and cmgr, to my understanding it's necessary to manually rebuild any existing derived challenges to include updates made to the base class, so this mostly helps with new challenges, and not as much in a case like the picoGym where the same challenges are kept running indefinitely.
Having cmgr push/pull build images to a remote registry might go a long way towards solving the hypothetical server A/B/C scenario above, since we would just connect C to the same existing registry and avoid rebuilding any existing images. But, that does assume that we'd never want to increase the number of builds for a certain challenge from say 5 to 7, in which case the two newly-built images might differ significantly from the existing five.
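For the registry idea, the push side could look something like the sketch below, using the Docker Go SDK; the registry host, tag scheme, and auth handling are placeholders, cmgr's actual image naming is not shown, and recent SDK releases have moved some of these option types, so treat the exact signatures as approximate.

```go
// Sketch: mirror an already-built challenge image to a private registry so a
// new server can pull it instead of rebuilding. All names are illustrative.
package mirror

import (
	"context"
	"io"
	"os"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

func mirrorBuild(ctx context.Context, cli *client.Client, localRef, remoteRef, registryAuth string) error {
	// Re-tag the local build image with the private registry reference,
	// e.g. "registry.example.com/cmgr/my-flask-challenge:build-3".
	if err := cli.ImageTag(ctx, localRef, remoteRef); err != nil {
		return err
	}
	// Push it; RegistryAuth is the base64-encoded auth config for the registry.
	rc, err := cli.ImagePush(ctx, remoteRef, types.ImagePushOptions{RegistryAuth: registryAuth})
	if err != nil {
		return err
	}
	defer rc.Close()
	_, err = io.Copy(os.Stdout, rc) // stream push progress
	return err
}
```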
So, I'm not sure there's any perfect solution to the reproducibility/patchability tradeoff, and I do understand if you don't want to take `cmgr` in that direction. It's just something I've been thinking about re: supporting Gym challenges long-term.
(I also had a vague idea about potentially adding something like `npm audit` plus tagging the base images, as a way to split the difference between reproducibility and patchability, but I haven't fully fleshed it out):
```
$ cmgr audit
# 1 outdated challenge found
# Challenge `flask-sqlite` uses outdated base image cmgr/flask-base:1.0 (current: 1.2)
$ cmgr upgrade ./challenges/flask-sqlite
# Rewrote ./challenges/flask-sqlite/Dockerfile
$ cmgr test ./challenges/flask-sqlite # OK
$ cmgr update
```
The use case as you describe it makes sense, and I think there might be a way to get there in a semi-flexible manner.
All of the pre-made challenge types have a comment along the lines of "End of shared layers" in them, and past that point the challenge should not need to hit the Internet to install anything else unless it is doing custom package fetching in a Makefile or elsewhere. It should be possible to tag and save that layer into a private repository to freeze the base image while providing a high likelihood that creating new builds will just work.
That said, I'm not sure what the right interface for this should be. I think having a "freeze" command (exposed in `cmgr` and `cmgrd`) is likely the right path, but the semantics and statefulness of it deserve some deliberate thinking. Basically, the proposed workflow for `frozen/challenge` would be:

- `cmgr test frozen/challenge` and `cmgr playtest frozen/challenge`
- `cmgr freeze frozen/challenge` from a `cmgr` instance that points to the repo and has write permissions (developer box?)
- `cmgr` & `cmgrd` will first check for a frozen version in the private repo and use that as the starting point

I think there is a lot of hidden complexity in making that last step actually work, but I think that would move things in the correct direction.
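For that last step, the lookup could be as simple as "try to pull the frozen tag, otherwise build as today." A sketch, where the frozen reference format and the fallback behavior are assumptions on my part:

```go
// Sketch: prefer a frozen base image from the private registry, otherwise
// signal the caller to fall back to the normal embedded-Dockerfile build.
package freeze

import (
	"context"
	"fmt"
	"io"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

func baseImage(ctx context.Context, cli *client.Client, frozenRef string) (string, error) {
	rc, err := cli.ImagePull(ctx, frozenRef, types.ImagePullOptions{})
	if err != nil {
		// No frozen version (or the registry is unreachable).
		return "", fmt.Errorf("no frozen image for %s: %w", frozenRef, err)
	}
	defer rc.Close()
	_, _ = io.Copy(io.Discard, rc) // drain the progress stream so the pull completes
	return frozenRef, nil
}
```

Part of the hidden complexity is presumably deciding when an unreachable registry should be a hard error versus a silent fallback to rebuilding, since the latter quietly loses the reproducibility guarantee.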
I definitely like the idea of an optional `freeze` command. That makes a lot more sense compared to disrupting the existing challenge creation workflow, and freezing the already-built layers neatly sidesteps the apt version-pinning issue.

A private repo would be nice due to layer deduplication, but perhaps even just dumping a tarball of the frozen upper layers back into the challenge directory upon `freeze` could work, with the advantage that the challenge data and its frozen base image would be stored together...
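A rough sketch of that tarball variant using the SDK's save/load calls; the `frozen-base.tar` name and the idea of writing it next to the challenge files are just illustrations, and the exact `ImageLoad` signature has shifted across SDK versions:

```go
// Sketch: freeze a built image to a tarball inside the challenge directory,
// and load it back into the local daemon on a new server. Names are made up.
package freeze

import (
	"context"
	"io"
	"os"
	"path/filepath"

	"github.com/docker/docker/client"
)

func freezeToTar(ctx context.Context, cli *client.Client, imageRef, challengeDir string) error {
	rc, err := cli.ImageSave(ctx, []string{imageRef})
	if err != nil {
		return err
	}
	defer rc.Close()
	out, err := os.Create(filepath.Join(challengeDir, "frozen-base.tar"))
	if err != nil {
		return err
	}
	defer out.Close()
	_, err = io.Copy(out, rc)
	return err
}

func thawFromTar(ctx context.Context, cli *client.Client, challengeDir string) error {
	in, err := os.Open(filepath.Join(challengeDir, "frozen-base.tar"))
	if err != nil {
		return err
	}
	defer in.Close()
	resp, err := cli.ImageLoad(ctx, in, true) // quiet=true: suppress progress JSON
	if err != nil {
		return err
	}
	return resp.Body.Close()
}
```

The obvious tradeoff versus a registry is size: a saved tarball contains full layers with no deduplication across challenges that share the same base.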
Sorry for sort of hijacking this issue with this discussion! We'd be putting `cmgr` challenges in the picoGym next March at the earliest, so this isn't an immediately pressing issue, but between now and then I'll spend some additional time thinking about the `freeze` idea.
During development, it is useful to manually execute the Docker build to get better debug logs and to experiment a bit. That process would be easier if `cmgr dockerfile [challenge type]` (or something similar) would print/save the requested Dockerfile.
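For what it's worth, if the built-in Dockerfiles are (or were) embedded with `go:embed`, a subcommand like that could be quite small; the `dockerfiles/` path below mirrors the repo layout linked earlier, but the embedding details here are an assumption rather than cmgr's actual implementation:

```go
// Sketch of a hypothetical `cmgr dockerfile <challenge type>`: print the
// embedded Dockerfile so it can be built and debugged manually with `docker build`.
package main

import (
	"embed"
	"fmt"
	"os"
)

//go:embed dockerfiles/*.Dockerfile
var dockerfiles embed.FS

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: cmgr dockerfile <challenge type>")
		os.Exit(1)
	}
	data, err := dockerfiles.ReadFile("dockerfiles/" + os.Args[1] + ".Dockerfile")
	if err != nil {
		fmt.Fprintf(os.Stderr, "unknown challenge type %q: %v\n", os.Args[1], err)
		os.Exit(1)
	}
	os.Stdout.Write(data)
}
```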