Open · bradlarsen opened this issue 3 months ago
The caching also doesn't seem to be working as one would hope within the Docker image builder workflows. There, however, you have an additional caching mechanism to worry about beyond just GitHub Actions cache: the Docker cache.
I believe things are plumbed properly so that Docker will attempt to use GitHub Actions' cache, but the underlying caching problem reported above is surely impacting that as well.
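For reference, wiring a Docker build to the GitHub Actions cache typically looks something like the following workflow fragment. This is an illustrative sketch, not necessarily the exact configuration in this repository:

```yaml
# Sketch of a build step using BuildKit's GitHub Actions cache backend.
# Step and image names here are illustrative.
- name: Build image
  uses: docker/build-push-action@v5
  with:
    context: .
    push: false
    cache-from: type=gha  # read layer cache from the GitHub Actions cache
    cache-to: type=gha    # write back; defaults to mode=min (final-image layers only)
```

Even with this plumbing in place, the build only benefits if the cache entries survive long enough to be hit on the next run.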
In #164, I've improved the caching situation in regular GitHub Actions some, giving a 30-50% reduction in build times when a cache hit occurs.
To fix the caching problems in Docker, more work is involved. See here for some ideas: https://gist.github.com/noelbundick/6922d26667616e2ba5c3aff59f0824cd
Note for Docker caching problems in CI: it looks like building the Debian-based Docker images runs `apt install` every time instead of ever caching that layer, which invalidates everything following it.
The GitHub Actions cache limit might be part of the problem. I sorted all cache entries by least recently used, and all entries had been used within the last day or two (even entries a month old). Since we hit the limit, some cache entries are getting evicted, reducing their usefulness (I think, but I'm not entirely sure).
I recently split out Vectorscan into its own published crate. Now in the CI jobs, that long-building package can be cached, and cached build job times are all under 2 minutes. For example:
https://github.com/praetorian-inc/noseyparker/actions/runs/8594814073
Relevant discussion I saw over the weekend about caching and Docker builds in GitHub Actions: https://news.ycombinator.com/item?id=39956327. It sounds like the caching scheme we are currently using (GitHub Actions cache for Docker) is doomed, as the cache is too small for the images we are building.
One possible alternative is to use the `registry` cache mode for the builder job. This uses an external location (like the GHCR registry) for caching, instead of GitHub Actions. Needs investigation.
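A registry-backed cache might look something like the fragment below. The `OWNER/IMAGE` names and the `:buildcache` tag are placeholders, not this repository's actual configuration:

```yaml
# Sketch: store the BuildKit layer cache in GHCR instead of the Actions cache.
# OWNER/IMAGE and the buildcache tag are hypothetical placeholders.
- name: Build image with registry-backed cache
  uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: ghcr.io/OWNER/IMAGE:latest
    cache-from: type=registry,ref=ghcr.io/OWNER/IMAGE:buildcache
    cache-to: type=registry,ref=ghcr.io/OWNER/IMAGE:buildcache,mode=max
```

Because the cache lives in a container registry rather than the size-limited Actions cache, `mode=max` (exporting all intermediate layers) becomes practical.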
I experimented a bunch in #169, but failed to find a GitHub Actions incantation that resulted in Docker image builds that actually effectively cached the build stages.
Caching works better now for regular CI jobs, but for jobs that build Docker images, the caching mostly doesn't work at all.
Going back to the matter I brought up earlier (https://github.com/praetorian-inc/noseyparker/pull/169#discussion_r1557906813), I still think the `cache-from` directive should be added there. It doesn't create any additional cache; it could only use the cache generated by this part. It seems like a small potential improvement with no drawbacks.
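The read-only arrangement being suggested would look roughly like this (a sketch; the step name is illustrative):

```yaml
# Sketch: a build step that consumes the CI-populated cache but never writes to it.
# Specifying cache-from without cache-to makes the cache read-only for this job.
- name: Release build (read-only cache)
  uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    cache-from: type=gha  # no cache-to, so this job only reads the cache
```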
The other problem I see with changing the Docker cache mode to `min` is that it won't be very effective. Per the official docs:

> In `min` cache mode (the default), only layers that are exported into the resulting image are cached

The current Dockerfile setup will cache only the `apt-get` steps, as the final image is the runner image with the binary copied in from the builder stage. To actually leverage caching, I see two options:

1. Keep `min` cache mode and split the `builder` and `runner` Dockerfile stages (at least for Debian; Alpine needs to be separated into more stages first) into separate files, so that each stage is the "final stage." This is a small improvement, but as long as dependencies don't change, it should shave off quite a lot of CI time.
2. Switch to `max` cache mode and figure out the caching (probably using the registry).
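The first option could also be approximated without separate Dockerfiles by building each stage with its own `target` and cache scope, so that the builder stage's layers count as "final" for caching purposes. A sketch, with illustrative stage and scope names:

```yaml
# Sketch: build the builder stage on its own so mode=min exports its layers,
# then build the final image reusing that cache. Names are illustrative.
- name: Build builder stage
  uses: docker/build-push-action@v5
  with:
    context: .
    target: builder
    cache-from: type=gha,scope=builder
    cache-to: type=gha,scope=builder

- name: Build final image
  uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    cache-from: |
      type=gha,scope=builder
      type=gha,scope=image
    cache-to: type=gha,scope=image
```

The separate `scope` values keep the two stages' cache entries from clobbering each other in the Actions cache.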
Oh, I understand what you're saying about that now: read from the cache if possible, but don't write to it in release builds. I suspect it won't provide any cache benefit today, but it's worth a try.
**Describe the bug**
The GitHub Actions workflows for Nosey Parker don't seem to make effective use of the Actions cache. This causes workflows to run much slower than they otherwise might.

**To Reproduce**
For example, take a successful CI run and then rerun it. Examine the details of the Actions log. Additionally, check the build timing artifacts produced by the CI jobs; these give more detail about how many dependencies are being rebuilt by `cargo` and how long they are taking.

**Expected behavior**
You would expect the jobs to complete quickly, since nothing has changed and the cache has been populated.

**Actual behavior**
The jobs build nearly everything from scratch, even when cache entries are found.