julia-actions / cache

A shortcut action to cache Julia artifacts, packages, and registries.
MIT License
38 stars 9 forks source link

Delete cache entries on the workflow branch #95

Closed omus closed 10 months ago

omus commented 10 months ago

I've noticed that the post-action step which automatically cleans up old cache entries accidentally cleans up entries created on other branches. This reduces performance with multiple PRs are being developed concurrently especially in the scenario where a PR deletes the cache entry for the main branch resulting in all new PRs having cache misses until a PR has been merged.

This PR updates the handle_caches.jl script to only delete caches which match the restore key using the same git ref as the executing workflow. This should result in workflows run on the main branch only deleting cache entries on main and PRs only deleting cache entries for that PR.

For more details about cache isolation in regards to branches see: https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#restrictions-for-accessing-a-cache

IanButterworth commented 10 months ago

Ah yeah. Sounds good.

Should we also clean up if the branch no longer exists? or does gh do that automatically?

omus commented 10 months ago

Seems like I may need to do some additional work here as no deletion occurred during the restore post step. Could I get write access to this repo so I can view the cache entries for this repo?

IanButterworth commented 10 months ago

granted!

omus commented 10 months ago

The addition of delete-old-caches: required makes it so the post step will fail if any errors occurred. This is useful for catching with this code as part of our CI

omus commented 10 months ago

Failure I'm seeing:

 X Failed to delete cache: HTTP 403: Resource not accessible by integration (https://api.github.com/repos/julia-actions/cache/actions/caches/833)
┌ Error: ProcessFailedException(Base.Process[Process(`gh cache delete 833 --repo julia-actions/cache`, ProcessExited(1))])
Failed to delete 1 existing caches on ref `refs/pull/95/merge` matching restore key `f692b33bcf976173a90399f75e503f5069beffac7a1433ff6d0260fea3a39b0a;os=macOS;dep-name=pandoc_jll;dep-version=3;dep-invalid-chars=%2C;os=macOS-latest;`
└ @ Main ~/work/cache/cache/handle_caches.jl:33

The gh cache delete does support deletion by ID however the [GitHub API delete] uses key and ref and doesn't support ID. I'll try using the API call. I'll note that the gh cache delete does work locally for me.

omus commented 10 months ago

Got a different error message with key and ref:

{"message":"Resource not accessible by integration","documentation_url":"https://docs.github.com/rest/actions/cache#delete-github-actions-caches-for-a-repository-using-a-cache-key"}

Something extra odd is that a save step failed which is weird as the cache entry should be unique for each commit.

Also, I missed the API entry for deleting by ID. Will try that too.

omus commented 10 months ago

Okay one of the issues is that the non-matrix job attempt to delete the cache from the matrix job. These are the IDs it was trying to delete:

{
  "created_at": "2024-01-12T20:18:19.553333300Z",
  "id": 858,
  "key": "0f738e3aa54b6c565d0653f7f6c693f0e2ee7daf4df156cbd438d0dddd3f59ce;os=Linux;dep-name=pandoc_jll;dep-version=3;dep-invalid-chars=%2C;os=ubuntu-latest;run_id=7507024977;run_attempt=1",
  "last_accessed_at": "2024-01-12T20:18:19.553333300Z",
  "ref": "refs/pull/95/merge",
  "size_in_bytes": 33632865,
  "version": "d77c1c90e00a64a184458d2436d9e29a8827ebefbf0e808722e5319060bd6e95"
}
{
  "created_at": "2024-01-12T20:18:06.460000000Z",
  "id": 857,
  "key": "0f738e3aa54b6c565d0653f7f6c693f0e2ee7daf4df156cbd438d0dddd3f59ce;os=Linux;run_id=7507024977;run_attempt=1",
  "last_accessed_at": "2024-01-12T20:18:06.460000000Z",
  "ref": "refs/pull/95/merge",
  "size_in_bytes": 33632694,
  "version": "3fb0ac1e903d37e8b7c1b4bfa95ee72568fbf59e290bc6a968260251aa12fb2a"
}

This is strictly a test problem so I'll change it so the matrix/non-matrix jobs use different cache-names

omus commented 10 months ago

Should we also clean up if the branch no longer exists? or does gh do that automatically?

I do not think GitHub does it automatically but I don't think we should try to automatically clean up other branches as that would require listing all available caches and probably result in hitting throttling limits. I think we should just leave those caches alone as if the branch is unused they won't be referenced again and will automatically be evicted once the cache room is full and they are the oldest cache entries.

One alternative would be to recommend having another workflow like this which deletes the cache entries when the PR is closed: https://github.com/actions/cache/blob/main/tips-and-workarounds.md#force-deletion-of-caches-overriding-default-cache-eviction-policy

omus commented 10 months ago

Ah, the workflow problem I'm having is due to the workflows existing in a fork

omus commented 10 months ago

I spend way too much time on investigating why the post cache cleanup was failing here. As it turns out the actions: write permission is set to actions: read for PR forks. I may end up making a new PR and close this one to preserve the work done here but make the PR people review more reasonable.

omus commented 10 months ago

Superseded by #97