julia-actions / cache

A shortcut action to cache Julia artifacts, packages, and registries.
MIT License
38 stars 9 forks source link

Enable cross-OS caching for generic dirs. Enable OS-specific cache-compiled by default. #45

Closed SaschaMann closed 1 year ago

SaschaMann commented 1 year ago

See https://github.com/actions/cache/pull/1056.

Fixes #44


As pointed out in the upstream actions, symlinks may lead to breakage. I don't think that affects the cached directories, does it?

I moved the compiled caching to its own steps, that way we enable people to benefit from cross-OS caching where possible while still caching compiled.


Existing caches will no longer be restored due to the name change, so this may cause a wave of cache misses once released.

SaschaMann commented 1 year ago

@giordano any idea what could be causing the failure? Does windows do something weird with the artifacts dir?

giordano commented 1 year ago

No, I don't think so :confused:

SaschaMann commented 1 year ago

I'll change the CI to test all combination of where a cache was created vs where it's being restored. Maybe that will make it clearer where the error comes from.

IanButterworth commented 1 year ago

Worth re-running CI? Logs have expired

giordano commented 1 year ago

There's a merge conflict now

fatteneder commented 1 year ago

There's a merge conflict now

I don't understand much about github ci but maybe this patch resolves the conflicts?

diff --cc action.yml
index 65ad5e6,d919963..0000000
--- a/action.yml
+++ b/action.yml
@@@ -34,25 -37,41 +37,43 @@@ runs
      - id: paths
        run: |
          [ "${{ inputs.cache-artifacts }}" = "true" ] && A_PATH="~/.julia/artifacts"
 -        echo "ARTIFACTS_PATH=$A_PATH" >> $GITHUB_ENV
 +        echo "artifacts-path=$A_PATH" >> $GITHUB_OUTPUT
          [ "${{ inputs.cache-packages }}" = "true" ] && P_PATH="~/.julia/packages"
 -        echo "PACKAGES_PATH=$P_PATH" >> $GITHUB_ENV
 +        echo "packages-path=$P_PATH" >> $GITHUB_OUTPUT
          [ "${{ inputs.cache-registries }}" = "true" ] && R_PATH="~/.julia/registries"
 -        echo "REGISTRIES_PATH=$R_PATH" >> $GITHUB_ENV
 +        echo "registries-path=$R_PATH" >> $GITHUB_OUTPUT
-         [ "${{ inputs.cache-compiled }}" = "true" ] && PCC_PATH="~/.julia/compiled"
-         echo "precompilation-cache-path=$PCC_PATH" >> $GITHUB_OUTPUT
        shell: bash

 -    - uses: actions/cache@58c146cc91c5b9e778e71775dfe9bf1442ad9a12
 +    - uses: actions/cache@704facf57e6136b1bc63b828d79edcd491f0ee84
        id: cache
        with:
-         path: "${{ format('{0}\n{1}\n{2}\n{3}', steps.paths.outputs.artifacts-path, steps.paths.outputs.packages-path, steps.paths.outputs.registries-path, steps.paths.outputs.precompilation-cache-path) }}"
-         key: ${{ runner.os }}-${{ inputs.cache-name }}-${{ hashFiles('**/Project.toml') }}
 -        path: "${{ format('{0}\n{1}\n{2}', env.ARTIFACTS_PATH, env.PACKAGES_PATH, env.REGISTRIES_PATH) }}"
++        path: "${{ format('{0}\n{1}\n{2}', steps.paths.outputs.artifacts-path, steps.paths.outputs.packages-path, steps.paths.outputs.registries-path) }}"
+         key: ${{ inputs.cache-name }}-${{ hashFiles('**/Project.toml') }}
          restore-keys: |
-           ${{ runner.os }}-${{ inputs.cache-name }}-
+           ${{ inputs.cache-name }}-
+         enableCrossOsArchive: true

      - id: hit
        run: echo "cache-hit=$CACHE_HIT" >> $GITHUB_OUTPUT
        env:
          CACHE_HIT: ${{ steps.cache.outputs.cache-hit }}
        shell: bash
+ 
++    # or should this be (to align with above)
++    # - uses: actions/cache@704facf57e6136b1bc63b828d79edcd491f0ee84
+     - uses: actions/cache@58c146cc91c5b9e778e71775dfe9bf1442ad9a12
+       id: cache-compiled
+       if: inputs.cache-compiled == 'true'
+       with:
+         path: ~/.julia/compiled
+         key: ${{ inputs.cache-name }}-compiled-${{ hashFiles('**/Project.toml') }}
+         restore-keys: |
+           ${{ inputs.cache-name }}-compiled-
+         enableCrossOsArchive: false
+ 
+     - id: compiled-hit
+       if: inputs.cache-compiled == 'true'
+       run: echo "cache-compiled-hit=$CACHE_HIT" >> $GITHUB_OUTPUT
+       env:
+         CACHE_HIT: ${{ steps.cache-compiled.outputs.cache-hit }}
+       shell: bash
IanButterworth commented 1 year ago

@SaschaMann let me know if you want help moving this forward (resolving the merge conflict)

IanButterworth commented 1 year ago

Hope you don't mind me going ahead. I want to make progress on https://github.com/JuliaLang/julia/issues/50667

SaschaMann commented 1 year ago

@IanButterworth feel free to take over. I don't have the capacity to work on this atm unfortunately.

IanButterworth commented 1 year ago

I'm making the proposal to enable cache-compiled by default because:

SaschaMann commented 1 year ago

Happy for this to be merged if you all agree it's fine and I agree with Rik that it's hard to fully test this beforehand. However, if the CI failures are correct, they should be addressed. If not, maybe the CI needs fixing or removal.

Thanks for pushing this forward!

IanButterworth commented 1 year ago

Once https://github.com/JuliaLang/julia/pull/52123 is merged my plan is to finish this up and get it released before 1.11 is out so we can make faster CI land with 1.11 nicely

IanButterworth commented 1 year ago

I've been testing now that https://github.com/JuliaLang/julia/pull/52123 is merged and I'm seeing big speed ups in cached runs with no precompilation.

However, two thoughts:

  1. I think it might not make sense to separate the caching of /compiled an other depot dirs, as they are all kind of interlinked.
  2. Currently the compiled cache key isn't correct because it doesn't save the cache if any new precompilation has been done without changes to the project, which is possible if the registry has changed. Perhaps we should just always re-save it? If that's possible? Or is there something else that would be fast to hash that would be sensitive and specific-enough?
fatteneder commented 1 year ago

Or is there something else that would be fast to hash that would be sensitive and specific-enough?

What about Manifest.toml? Is it possible to run a small Julia script in the inputs section, which resolves all Pkg versions. That script could perhaps also generate a custom hash that accounts for arch, OS, Manifest, etc.

IanButterworth commented 1 year ago

That won't be sensitive to test dep version changes as the test env is a temp dir merge.

IanButterworth commented 1 year ago

I'm a bit stumped in trying to get hashFiles to hash the compiled dir.

One issue is that it will manually ignore any files outside of the github workspace, even if they're returned by glob. However the 2nd arg of hashFiles currentWorkspace allows you to set a custom workspace during the check.

But I've set this up in a way I think should work and two issues:

  1. The glob is returning all files not just .ji files
  2. They are all being rejected for being outside of the current workspace.. (note that the log is misleading and makes it look like it is specifically controlled by GITHUB_WORKSPACE, but the currentWorkspace arg overrides this. Fixed in my PR)
##[debug]/home/runner/.julia/compiled/v1.11/Reexport
##[debug]Ignore '/home/runner/.julia/compiled/v1.11/Reexport' since it is not under GITHUB_WORKSPACE.
##[debug]/home/runner/.julia/compiled/v1.11/Reexport/bTpYr_AlfsU.ji
##[debug]Ignore '/home/runner/.julia/compiled/v1.11/Reexport/bTpYr_AlfsU.ji' since it is not under GITHUB_WORKSPACE.
##[debug]/home/runner/.julia/compiled/v1.11/Reexport/bTpYr_AlfsU.so
##[debug]Ignore '/home/runner/.julia/compiled/v1.11/Reexport/bTpYr_AlfsU.so' since it is not under GITHUB_WORKSPACE.
##[debug]/home/runner/.julia/compiled/v1.11/Reexport/bTpYr_sbGYn.ji
##[debug]Ignore '/home/runner/.julia/compiled/v1.11/Reexport/bTpYr_sbGYn.ji' since it is not under GITHUB_WORKSPACE.
##[debug]/home/runner/.julia/compiled/v1.11/Reexport/bTpYr_sbGYn.so
##[debug]Ignore '/home/runner/.julia/compiled/v1.11/Reexport/bTpYr_sbGYn.so' since it is not under GITHUB_WORKSPACE.
DilumAluthge commented 1 year ago

What's the benefit of cross-os caching? Precompile cachefiles (.ji and .so/.dylib/.jll files) are not portable across different operating systems.

IanButterworth commented 1 year ago

@DilumAluthge PR title was confusing. Updated

IanButterworth commented 1 year ago

Though I am starting to wonder whether it's just better to just maintain an OS-specific single cache of the depot.. so that everything is kept in lock-step

IanButterworth commented 1 year ago

After much debugging yesterday it turns out that the cache key is generated in the pre step, so we cannot make it depend on the state of the files being cached (unless https://github.com/actions/cache/pull/673 is added).

That means we can't be smart about when to save a cache state, as we cannot rely on anything in the repo alone to inform that.

One strategy is to always save at the end, with the same cache key, but to do that we have to manually delete the old cache before saving it. There's no update mode in the API. It might require more user auth.

IanButterworth commented 1 year ago

I also think the enableCrossOsArchive: true thing won't work for us because basically always in CI the jobs are run concurrently, meaning the caches from different OS runs won't be able to sequentially coalesce content into the same cache file most of the time.

I'm thinking about opening another PR that:

Though this model won't be ideal for cases where multiple jobs run concurrently on the same julia version & OS. That needs some more thought.

IanButterworth commented 1 year ago

Check out https://github.com/julia-actions/cache/pull/71. I think it's a more sensible approach

IanButterworth commented 1 year ago

Now that #71 is merged I think this can be closed