rojopolis / spellcheck-github-actions

Spell check action
MIT License
132 stars 38 forks source link

Dockerfile - use multi-stage for performance increase & specific language installation #39

Closed aSemy closed 3 years ago

aSemy commented 3 years ago

Thanks to @edumco for his support in #13

From what I understand, building a whole docker image for spellcheck is slow. There's a few different (and complementary) ways of fixing it. Here's my suggestion: multi-stage docker builds.

There's a base docker image that can be hosted online. This contains everything that's needed. Then when a user needs to spell check a specific language, docker-multistage fires up and takes that base image (which is just a quick cacheable(?) download), and very quickly installs the required language.

Also, I used the new COPY --chmod to combine setting permissions with the copying.

Commands for building

You can create a new image with any language you want by specifying the aspell language code in a comma-separated-list build-arg:

docker build -t spellcheck --build-arg SPELLCHECK_LANGS="en,fr,hr" .

The base image can be made specifically by setting the target --target, and tag it with -t

docker build --target spellcheck-builder -t spellcheck-builder .

For testing performance of a fresh build, disable the cache with --no-cache

docker build --no-cache --target spellcheck-builder -t spellcheck-builder .

Disclaimer

By the way I have very little idea about what I'm doing, or if this is even the 'best' approach. I haven't even tested it (just hacked around locally). I wanted to try learning Docker multistage, and it was fun trying!

If you have points to improve, I would really appreciate it if you picked them up and ran with them - I don't have a lot of time for contributing. And honestly I feel a bit rude, dumping a PR without testing or really being a user or part of the project. But anyway, I hope that's okay.

Final note, I found https://github.com/wagoodman/dive a really good tool for investigating layers.

jonasbn commented 3 years ago

Thanks I will review the proposed changes

edumco commented 3 years ago

@aSemy there is some conflict. Can you check it?

aSemy commented 3 years ago

Fixed (bumped the base image to python:3.9.4-slim)

edumco commented 3 years ago

@aSemy I tried to run build and run and found 2 problems:

chmod flag on COPY

The flag was not recognized on the default docker version which comes on Ubuntu 20.04 LTS

When we don't specify an user on Dockerfile all commands are executed as root so you don't need change file permissions to execute them

COPY entrypoint.sh /entrypoint.sh
...

ENTRYPOINT has to call sh or bash passing the script as parameter

The file entrypoint.sh is not a binary, it is a script interpreted by a shell. So we have to call the shell (sh is the most universal) to execute the script.

...
ENTRYPOINT ["/bin/sh", "/entrypoint.sh"]

As soon as possible I will make a pull request with a pipeline to build and check the docker image. Until there can you fix this?

aSemy commented 3 years ago

Thanks for finding those issues! I've pushed a fix to remove the --chmod and set the shell to sh (I had copied those from the current Dockerfile)

edumco commented 3 years ago

@aSemy whats missing to fix the conflict on code?

Can you rebase and review?

jonasbn commented 3 years ago

@edumco and @aSemy

The base image has been updated from python-3.9.4-slim to python-3.9.5-slim