windsource / picus

Connects to Woodpecker CI and dynamically creates an agent in the cloud.
MIT License

Large container size #3

Closed ljoonal closed 1 year ago

ljoonal commented 1 year ago

TLDR: The ghcr.io/windsource/picus:0.3.0 image seems to be around 112MB, which could probably be slimmed down a lot, or an alternative slimmer variant could be created.

But to explain in a bit more detail... First of all, this kind of auto scaling is a really good idea! So good in fact that I had implemented something similar myself already before discovering this :/

I usually try to write comparisons whenever I find projects with functionality similar to mine, and found the Docker image size to be even larger than I expected. I thought I'd open an issue about it, since that's the only thing I think I'm currently doing better (for reference, the image built from my project's Dockerfile currently comes down to less than 2MB), and I assume it'll be quite easy to fix :'D In short, the container could be built in multiple stages, for example with lukemathwalker/cargo-chef:latest-rust-alpine, with the final stage being FROM scratch.
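For illustration, a multi-stage build along those lines might look roughly like the sketch below. It follows the usual cargo-chef pattern; the binary name `picus`, the `/app` working directory, and the assumption that the binary links fully statically against musl are illustrative guesses, not taken from either project's actual Dockerfile.

```dockerfile
# Sketch only: names and paths are assumptions, see the note above.
FROM lukemathwalker/cargo-chef:latest-rust-alpine AS chef
WORKDIR /app

# Stage 1: compute a "recipe" describing the dependency tree.
FROM chef AS planner
COPY . .
RUN cargo chef prepare --recipe-path recipe.json

# Stage 2: build dependencies first (cached as long as the recipe is unchanged),
# then build the application itself.
FROM chef AS builder
COPY --from=planner /app/recipe.json recipe.json
RUN cargo chef cook --release --recipe-path recipe.json
COPY . .
RUN cargo build --release

# Stage 3: ship only the binary; there is no shell and no libc in this image.
FROM scratch
COPY --from=builder /app/target/release/picus /picus
ENTRYPOINT ["/picus"]
```

A `FROM scratch` image contains no CA certificates either, so an app that makes HTTPS calls would additionally need certificates copied in or compiled into the binary.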

windsource commented 1 year ago

Hi @ljoonal, great that we both had the same idea on such a project.

I also value small image sizes, so usually I use Alpine as the base image. I tried that; the docker build worked fine, but when I started the picus app it crashed when a REST API was called. The only output was Segmentation fault (core dumped), but there was no core dump. This seems to be in line with https://www.reddit.com/r/rust/comments/sq53vx/alpine_fails_to_run_my_app_what_steps_should_i/. So finally I gave up and used debian slim as the runtime. What is actually the reason for wanting small image sizes? If you want to reduce download capacity and disk usage it is best to use only a few base images because then the layers are reused. So I settled on debian:11.5-slim as one of a few base images to use, so that it is reused among projects.

Regarding the multi-stage build: as you can see in 0.2.0, I already used a multi-stage build. But when switching to a multi-arch build (amd64 and arm64) in 0.3.0 I had to drop that, because the build time of an arm64 build on amd64 using qemu went up from about 3 minutes for the amd64 build to more than 30 minutes for arm64 on qemu. So I finally decided for cross compilation instead.

Images from scratch are definitely smaller, and so Woodpecker also uses images from scratch by default. But the drawback is that those are harder to debug. Once, when I had a problem with Woodpecker, I could not start a shell in that container because there was no shell inside. So at least for the Woodpecker agent I use Alpine images now.

ljoonal commented 1 year ago

> ...it crashed when a REST API was called. ...

Oh, I see. I didn't realize that such an issue existed, though I do remember having to fight similar issues myself when compiling for musl ^^' The linked reddit discussion makes the segfault seem like an issue with OpenSSL... So if it hasn't been tried yet, you could try building reqwest with rustls instead 😅 Though I doubt it'd be that easy to fix after all.

> If you want to reduce download capacity and disk usage it is best to use only a few base images because then the layers are reused

Also yup, if only all of the containers I use used the same base layers, which tends to not be the case sadly, since I prefer using upstream docker images where possible instead of rolling my own variants. Alpine seems to be the most common base, beyond that I've seen multiple variants of Debian and Ubuntu and such, which isn't nice on cheap servers with only a few gigabytes of storage to begin with.

> ...because the build time of an arm64 build on amd64... more than 30 minutes ... So I finally decided for cross compilation instead.

Oooh, neat, I must admit I've not even dabbled with multi architecture support. Does cross compilation have limitations where it can't be used with multi-stage builds? Or is it more so just the segfault issue that'd need to be solved, and then the benefits of both worlds could be had? :D

> Images from scratch... But the drawback is that those are harder to debug

Yup, which is why I also suggested that there could be a few variants of the images instead of just replacing it :)

windsource commented 1 year ago

> Does cross compilation have limitations where it can't be used with multi-stage builds?

You can probably also use multi-stage builds for cross compilation, but in my case it was just easier to build the target binaries beforehand and use buildx only for the final image. I wanted a single multi-arch image covering several architectures.
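A minimal sketch of that approach, assuming the binaries were cross-compiled beforehand into a hypothetical `dist/<arch>/picus` layout (the directory layout and install path are assumptions for illustration, not the actual Picus setup):

```dockerfile
# Sketch only: the dist/ layout and install path are assumptions.
FROM debian:11.5-slim

# TARGETARCH is provided automatically by BuildKit/buildx for each
# requested platform (e.g. "amd64" or "arm64"); it must be declared to be used.
ARG TARGETARCH

# Pick the pre-built binary matching the platform currently being assembled.
COPY dist/${TARGETARCH}/picus /usr/local/bin/picus

ENTRYPOINT ["/usr/local/bin/picus"]
```

Built with something like `docker buildx build --platform linux/amd64,linux/arm64 -t <image> --push .`, buildx evaluates this Dockerfile once per platform and combines the results into one multi-arch image. Since the Dockerfile only copies files, nothing has to be compiled under qemu emulation.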

> ...why I also suggested that there could be a few variants of the images instead of just replacing it :)

I might try to have an image from scratch as well for Picus. I have never done that for a Rust app before, so I need to figure out how to put everything I need inside (glibc, SSL certs, etc.)

windsource commented 1 year ago

Picus v0.3.1 now also creates distroless images which are much smaller (33.5 MB vs. 112MB before).
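For reference, a distroless runtime image for a glibc-linked Rust binary could look roughly like the following sketch. The base image `gcr.io/distroless/cc-debian11`, the `dist/<arch>/picus` layout, and the binary path are assumptions chosen for illustration; the image Picus actually uses may differ.

```dockerfile
# Sketch only: base image choice and paths are assumptions.
# The distroless "cc" image ships glibc, libgcc and CA certificates,
# but no shell or package manager.
FROM gcr.io/distroless/cc-debian11

# Set automatically by BuildKit/buildx per target platform.
ARG TARGETARCH

# Hypothetical location of a binary that was built outside the container.
COPY dist/${TARGETARCH}/picus /picus

ENTRYPOINT ["/picus"]
```

This sits between the two options discussed above: much smaller than a Debian slim base, while still providing the C runtime and certificates that a `FROM scratch` image would lack.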