julia-actions / cache

A shortcut action to cache Julia artifacts, packages, and registries.
MIT License
38 stars 8 forks source link

Update cache every run. Add `/compiled` and `/logs`. Make key sensitive to matrix. #71

Closed IanButterworth closed 9 months ago

IanButterworth commented 9 months ago

Alternative to https://github.com/julia-actions/cache/pull/45 Closes https://github.com/julia-actions/cache/issues/11 Fixes https://github.com/julia-actions/cache/issues/46

This keeps the depot in one cache file for the given key, which is sensitive to ~julia version, OS and arch (for the usual matrix var names, and there's a note if not).~ all matrix vars and the runner.OS. It also re-saves the cache every run, removing the previous cache that was used. By including /logs it also enables automatic Pkg.gc() to keep the depot size down, given that the depot will be restored>used>saved>restored>used>saved.. sequentially through the runs.

Regarding the split strategy in #45 , while there are some elements of the depot that are cross-OS compatible (/packages, /registries), for precompilation to remain hot, and for automatic Pkg.gc() to work properly, I think the depot should be cached in lock-step as a whole.

To implement this, because neither Github or actions/cache expose a way to update a cache file at a key, we have to run our own deletion just before the actions/cache save post step. So for a given key, there will only ever be one cache retained. This reduces the chance of the automatic eviction github does evicting a useful cache over an less useful one.

Demo

You can see it in action here https://github.com/JuliaIO/PNGFiles.jl/pull/72 Tests have been disabled, so it's just installation and loading.

First commit was with no cache files in the repo. Second empty commit benefitted from the caching.

On linux 1.11 goes from 2 minutes to 40s

Cache sizes are all around 30 MB.

And there will only ever be the same number as the permutations of the matrix, so 15 in that repo, which is a common number.

I will note that it seems there's been a sizeable regression in precompile duration in master recently. Needs investigating.

How to release

It'd be nice to get this widely adopted, so I'm intrigued about releasing this as a minor change. There's no breakage to the API, except for changes to defaults.

IanButterworth commented 9 months ago

isn't the most recent cache deleted?

No, the cache deletion step happens just before the cache/save step.

The way post jobs are queued is a little confusing.

IanButterworth commented 9 months ago

Ideally we'd only save if there were changes in the depot, but in practice that will be true every run because of the usage logs being updated. And usage logs allow us to do auto Pkg.gc

It kind of puts the decision on how to manage the depot in Julia's hands, which I think makes sense.

IanButterworth commented 9 months ago

Reminder to myself. We might want to add an input to turn off cache deletion. (Done)

IanButterworth commented 9 months ago

If you're able to review @SaschaMann let me know, otherwise I'll merge and release this as v1.4.0 tomorrow. (or let me know if you want more time)

IanButterworth commented 9 months ago

I've done some tidying. @rikhuijzer would you mind taking one last look before I merge? This is in the context of it being a non-breaking release

IanButterworth commented 9 months ago

Ok. I've found a better way than guessing matrix var names.. now this collates all matrix vars in the order they are entered automatically into a string.

So

      matrix:
        julia-version: ['1', '^1.10.0-rc1', 'nightly']
        os: [ubuntu-latest, windows-latest, macOS-latest]
        arch: [x64, x86]

becomes, for instance 1-ubuntu-latest-x64 And I included runner.OS just in case it's not part of the matrix.

So a full default key looks like julia-cache-Linux-1-ubuntu-latest-x64-6961372938-1

You can see the latest in action here https://github.com/JuliaIO/PNGFiles.jl/actions/runs/6961372938/attempts/1?pr=72 and the re-run with a hot cache (although some of the nightlies updated) https://github.com/JuliaIO/PNGFiles.jl/actions/runs/6961372938/attempts/2?pr=72

rikhuijzer commented 9 months ago

For future reference, here are the numbers from the two runs:

| Test Description | First Run | Second Run | |------------------|----------------|----------------| | test (nightly, macOS-latest) | 2m 56s | 3m 38s | | test (nightly, ubuntu-latest) | 2m 1s | 38s | | test (nightly, ubuntu-latest, ...) | 2m 0s | 1m 54s | | test (nightly, windows-latest, ...) | 3m 5s | 1m 36s | | test (nightly, windows-latest, ...) | 3m 4s | 1m 29s | | test (^1.10.0-rc1, macOS-latest, ...) | 1m 59s | 2m 0s | | test (^1.10.0-rc1, ubuntu-latest, ...) | 1m 38s | 1m 36s | | test (^1.10.0-rc1, ubuntu-latest, ...) | 1m 35s | 1m 28s | | test (^1.10.0-rc1, windows-latest, ...) | 2m 39s | 1m 19s | | test (^1.10.0-rc1, windows-latest, ...) | 2m 35s | 1m 8s | | test (1, macOS-latest, ...) | 3m 22s | 1m 20s | | test (1, ubuntu-latest, ...) | 1m 35s | 37s | | test (1, ubuntu-latest, ...) | 1m 32s | 40s | | test (1, windows-latest, ...) | 2m 39s | 1m 2s | | test (1, windows-latest, ...) | 2m 51s | 1m 21s | (Credits to ChatGPT. I just threw in two screenshots asked for a table. I manually verified all the numbers. They are correct.)

Nice speedup; especially on the latest stable release (Julia version '1') 🚀

IanButterworth commented 9 months ago

Note that some of the nightly versions changed between runs, so had to do re-precompilation.. That will be an issue until 1.11 is released.

But yeah, it's surprisingly effective on 1.9, but unfortunately that gain is lost in 1.10.0-rc1 because of more stdlibs outside of the sysimage. I'm going to look into whether we can fix that regression by moving away from mtime sensitivity when it comes to stdlibs. I.e. reviving https://github.com/JuliaLang/julia/pull/50919

IanButterworth commented 9 months ago

Ok. Should be good to go now. Final run check

cold cache hot cache
https://github.com/JuliaIO/PNGFiles.jl/actions/runs/6961978935/attempts/3?pr=72 https://github.com/JuliaIO/PNGFiles.jl/actions/runs/6961978935/attempts/5?pr=72
Screenshot 2023-11-22 at 2 56 43 PM Screenshot 2023-11-22 at 3 58 37 PM
IanButterworth commented 9 months ago

I think we may want to also include the job name in the key automatically?

IanButterworth commented 9 months ago

Also from testing this with a self hosted runner. We can't assume jq is installed

IanButterworth commented 9 months ago

Also, if there's no matrix

jq: error (at <stdin>:1): string ("") has no keys
SaschaMann commented 9 months ago

Also from testing this with a self hosted runner. We can't assume jq is installed

I'm not too worried about that. Requiring jq as a dependency seems fine, it's an extremely common tool and it's available on all hosted runners anyway.

IanButterworth commented 9 months ago

Note to self. Hitting this on a private repo during the post step, which doesn't currently make sense to me..

Setting working directory to /home/runner/work/Foo.jl/Foo.jl
failed to run git: fatal: not a git repository (or any of the parent directories): .git

ERROR: LoadError: failed process: Process(`gh cache list --limit 100`, ProcessExited(1)) [1]
IanButterworth commented 9 months ago

I'm noticing some things testing this with a larger less standard package setup, so I'm investigating.

IanButterworth commented 9 months ago

Ok, some tweaks:

IanButterworth commented 9 months ago

ugh.. turns out you can just ${{ join(matrix.*, '-') }} to get all matrix values joined with -.

So that avoids the need for jq 🎉

I'll go ahead and merge, and release tomorrow as 1.4.0 unless I hear otherwise.