stackabletech / issues

This repository is only for issues that concern multiple repositories or don't fit into any specific repository

Fix broken/flaky docker image builds #595

Closed fhennig closed 4 months ago

fhennig commented 5 months ago

Related PRs:

Split CI per Product

To be able to run an individual workflow for each product, we introduced various changes:

Distributed Docker Build Cache

The bake command from the image-tools-package package can configure Docker build cache backends. For now, the only economically viable backend is the registry backend.

Unfortunately this backend is broken for multi-stage images. See the discussion here.

This means we do not use the full potential of the Docker cache: only the intermediate images (and their layers) can be reused from the cache, not the layers of the final image. This is unfortunate because the final images are usually also the most expensive to build.
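For reference, a minimal sketch of what a registry cache backend looks like on a plain `docker buildx build` invocation. This is not the bake command from the tools package; the helper names and the `example.invalid` references are made up for illustration. Only the `--cache-from` / `--cache-to` flags with `type=registry` are real BuildKit options.

```python
from typing import List


def registry_cache_args(cache_ref: str, export: bool = True) -> List[str]:
    """Return buildx flags that read from (and optionally write to) a registry cache.

    `cache_ref` is a hypothetical image reference used only to store cache data.
    """
    args = ["--cache-from", f"type=registry,ref={cache_ref}"]
    if export:
        # mode=max asks BuildKit to export layers of all build stages,
        # not only those of the resulting image (the default is mode=min).
        args += ["--cache-to", f"type=registry,ref={cache_ref},mode=max"]
    return args


def build_command(image: str, context: str, cache_ref: str) -> List[str]:
    """Assemble the argv for a single cached image build (nothing is executed)."""
    return [
        "docker", "buildx", "build",
        "--tag", image,
        *registry_cache_args(cache_ref),
        context,
    ]


if __name__ == "__main__":
    # Hypothetical names; print the command instead of running it.
    print(" ".join(build_command(
        "example.invalid/product:dev", ".", "example.invalid/cache/product")))
```

Whether the exported cache is actually hit for the final stage of a multi-stage image is exactly the open question described above, so this only shows where the backend is configured.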

Proposal: spike splitting up multi-stage images into multiple single-stage images and test the caching impact on the build performance.

Meeting notes 1.7.2024:

Participants: on-prem

Meeting notes 20.6.2024:

Participants: @lfrancke , @dervoeti , @Maleware , @siegfriedweber , @razvan

Notes:

Not part of this issue, but related:

Look into potential helpers:

As discussed in the retro

to be refined.

### Tasks
- [ ] https://github.com/stackabletech/docker-images/pull/725

NickLarsenNZ commented 5 months ago

FYI, reusing workflows doesn't work too well with matrices (if you need to pass specific stuff between jobs). Not impossible, but will need a creative solution.

fhennig commented 5 months ago

We want to replace the existing workflow for dev images to only build images when necessary (when the corresponding paths have been changed).

I have investigated this before, and it is not straightforward to have mandatory checks that only need to run for changes in certain repositories. If a check only runs on certain changes but is also required to complete, a PR that does not trigger it will stay in an unmergeable state. A few people online have developed workarounds, but they are not pretty.

Just wanted to share this since I already looked into it before.
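One possible shape for such a workaround (not necessarily one of those referred to above): let the mandatory check always run, and decide inside the job which products actually need a build. If nothing relevant changed, the job exits successfully without building. The sketch below assumes a hypothetical layout with one top-level directory per product; the product names and shared prefixes are placeholders.

```python
from pathlib import PurePosixPath
from typing import Iterable, Set

# Hypothetical layout: one top-level directory per product.
PRODUCTS = {"kafka", "hbase", "zookeeper"}
# Hypothetical shared paths; a change under these rebuilds every product.
SHARED_PREFIXES = ("shared/", ".github/workflows/")


def products_to_build(changed_files: Iterable[str]) -> Set[str]:
    """Map the changed file paths of a PR to the set of products to rebuild."""
    selected: Set[str] = set()
    for name in changed_files:
        if name.startswith(SHARED_PREFIXES):
            # Shared build inputs changed: rebuild everything.
            return set(PRODUCTS)
        parts = PurePosixPath(name).parts
        if parts and parts[0] in PRODUCTS:
            selected.add(parts[0])
    return selected


if __name__ == "__main__":
    todo = products_to_build(["kafka/Dockerfile", "README.md"])
    if not todo:
        print("nothing to build")  # the check still completes, so the PR can merge
    else:
        print("building:", ", ".join(sorted(todo)))
```

The list of changed files would have to come from the CI environment (for example, a diff against the PR's base branch); that part is left out here.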