exercism / nim-test-runner

GNU Affero General Public License v3.0
2 stars 3 forks source link

build: use `alpine` for parent image, not `nimlang/nim` #64

Closed ee7 closed 3 years ago

ee7 commented 3 years ago

~I'll tidy up a few more things (e.g. the builder names; symlinking; maybe use just two build stages instead of three), but I think this PR is pretty close to complete.~

The resulting Docker image is now smaller than any Nim image I found in the wild. I have some ideas to improve it further, but that'll be for a later PR.

The commit message is reproduced below.


This commit reduces the size of our Docker image. Previously, we were using an image from the nimlang/nim Docker Hub repo that itself uses alpine:latest. We now use an alpine image directly, thereby removing some features/dependencies we don't need in the test runner environment.

Image size before this commit: 249.53 MB Image size with this commit: 139.66 MB

Some advantages:

Disadvantages:

This commit removes from the image:

To build the Nim compiler, this commit adds a new script - this is better than writing the commands in the Dockerfile directly because we can:

This commit also adds a shellcheck workflow.

ynfle commented 3 years ago

Is the 5 minutes delay worth it? Do we need the optimizations performed here?

ee7 commented 3 years ago

Is the 5 minutes delay worth it?

Absolutely. Improvements at run-time are often worth costs at build-time. And here I'd say that the improvement is big and the cost is small - ~I can probably minimize that extra build time anyway~ (Edit: see below post).

Do we need the optimizations performed here?

"need" is a strong word, but a near-2x reduction in the image size is a pretty big win. For security reasons, it's best practice for an image to contain as little as possible. This is especially important for our use case: running untrusted, user-provided code. Containers have enough security problems already.

And from the exercism docker docs:

The Dockerfile should create the minimal image needed for the tooling to function correctly and speedily.

The Dockerfile should produce an image with as a small a size as possible while maximising (and prioritising) performance.

The end-user may also benefit from a faster container startup time, depending on how things are deployed.

It also doesn't hurt financially. Exercism is using Amazon ECR, but hopefully Exercism optimizes its usage to suit their pricing anyway. (Edit: apparently our current usage of ECR is indeed "basically free")

I should have mentioned that I can add caching to the CI workflow, which theoretically means we'd only see that extra time when e.g. the Nim version is updated. We already have caching in the deployment workflow, see: https://github.com/exercism/nim-test-runner/blob/54cbe1311af4797dec22a3d2bbbe0f0c23708b97/.github/workflows/docker.yml#L19-L28

The main hurdle there is that the cache is deleted if it isn't accessed for a week (that's pretty short - it was 45 days on Travis CI). ~We can keep it alive via a scheduled job, which might be nice to have anyway.~ Edit: Actually, we'd want the scheduled job to periodically rebuild the cache, at least while we're using Alpine.

ee7 commented 3 years ago

Made some tweaks - ready for review.

I'd like to omit caching for now - we can consider adding it later.

Another problem with adding caching is that we want to use the latest versions of packages in order to get security updates (we can't reliably pin package versions on Alpine anyway), which means that we want to periodically invalidate the cache. We can do that by either:

(Note that this also applies to the existing caching in the deployment workflow).

But overall, I think it would take longer to set-up + test + maintain than the time it would save, so it's a low priority for me.