actions / cache

Cache dependencies and build outputs in GitHub Actions

Missing cache in containerized job when using v3 #1300

Open reith opened 9 months ago

reith commented 9 months ago

I populate a cache in one job running on an ubuntu-latest machine and try to read the cache in another job that runs in an ubuntu container. The workflow fails when I use v3 but succeeds with v2.

v3 sample:

jobs:
  download-cloud-sql-proxy-binary:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/cache@v3
        id: check-cached
        with:
          path: cloud-sql-proxy
          key: cloud-sql-proxy-binary-z4
          lookup-only: true
      - name: download cloud-sql-proxy binary
        if: ${{ steps.check-cached.outputs.cache-hit != 'true' }}
        run: |
          curl -o cloud-sql-proxy \
          https://storage.googleapis.com/cloud-sql-connectors/cloud-sql-proxy/v2.8.1/cloud-sql-proxy.linux.amd64; \
          chmod +x cloud-sql-proxy
      - name: cache cloud-sql-proxy binary
        if: ${{ steps.check-cached.outputs.cache-hit != 'true' }}
        uses: actions/cache@v3
        with:
          path: cloud-sql-proxy
          key: cloud-sql-proxy-binary-z4

  cache-user:
    runs-on: ubuntu-latest
    needs: download-cloud-sql-proxy-binary
    container:
      image: ubuntu
    steps:
      - uses: actions/cache@v3
        id: restore-cloud-sql-proxy-cache
        with:
          path: cloud-sql-proxy
          key: cloud-sql-proxy-binary-z4
          fail-on-cache-miss: true
      - run: false
        if: ${{ steps.restore-cloud-sql-proxy-cache.outputs.cache-hit != 'true' }}

The first job succeeds (though there is a complaint in the logs) and the second job fails:

##[group]Run actions/cache@v3
with:
  path: cloud-sql-proxy
  key: cloud-sql-proxy-binary-z4
  lookup-only: true
  enableCrossOsArchive: false
  fail-on-cache-miss: false
##[endgroup]
Cache not found for input keys: cloud-sql-proxy-binary-z4
##[group]Run curl -o cloud-sql-proxy \
curl -o cloud-sql-proxy \
https://storage.googleapis.com/cloud-sql-connectors/cloud-sql-proxy/v2.8.1/cloud-sql-proxy.linux.amd64; \
chmod +x cloud-sql-proxy
shell: /usr/bin/bash -e {0}
##[endgroup]
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 29.7M  100 29.7M    0     0  51.4M      0 --:--:-- --:--:-- --:--:-- 51.4M
##[group]Run actions/cache@v3
with:
  path: cloud-sql-proxy
  key: cloud-sql-proxy-binary-z4
  enableCrossOsArchive: false
  fail-on-cache-miss: false
  lookup-only: false
##[endgroup]
Cache not found for input keys: cloud-sql-proxy-binary-z4
Post job cleanup.
[command]/usr/bin/tar --posix -cf cache.tzst --exclude cache.tzst -P -C REDACTED --files-from manifest.txt --use-compress-program zstdmt
Cache Size: ~13 MB (14090595 B)
Cache saved successfully
Cache saved with key: cloud-sql-proxy-binary-z4
Post job cleanup.
[command]/usr/bin/tar --posix -cf cache.tzst --exclude cache.tzst -P -C REDACTED --files-from manifest.txt --use-compress-program zstdmt
Failed to save: Unable to reserve cache with key cloud-sql-proxy-binary-z4, another job may be creating this cache. More details: Cache already exists. Scope: ..., Key: cloud-sql-proxy-binary-z4, Version: 6a7ca9a071c19fff6e5d89b66af6636146c508ff13e993982877000fad0361da
.... second job ...
4ae94e8fb967ac979402174a638c11a07b177e9016fb6b75e3682aca36e94ff3
##[command]/usr/bin/docker start 4ae94e8fb967ac979402174a638c11a07b177e9016fb6b75e3682aca36e94ff3
4ae94e8fb967ac979402174a638c11a07b177e9016fb6b75e3682aca36e94ff3
##[command]/usr/bin/docker ps --all --filter id=4ae94e8fb967ac979402174a638c11a07b177e9016fb6b75e3682aca36e94ff3 --filter status=running --no-trunc --format "{{.ID}} {{.Status}}"
4ae94e8fb967ac979402174a638c11a07b177e9016fb6b75e3682aca36e94ff3 Up Less than a second
##[command]/usr/bin/docker inspect --format "{{range .Config.Env}}{{println .}}{{end}}" 4ae94e8fb967ac979402174a638c11a07b177e9016fb6b75e3682aca36e94ff3
HOME=/github/home
GITHUB_ACTIONS=true
CI=true
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
##[endgroup]
##[group]Waiting for all services to be ready
##[endgroup]
##[group]Run actions/cache@v3
with:
  path: cloud-sql-proxy
  key: cloud-sql-proxy-binary-z4
  fail-on-cache-miss: true
  enableCrossOsArchive: false
  lookup-only: false
##[endgroup]
##[command]/usr/bin/docker exec  4ae94e8fb967ac979402174a638c11a07b177e9016fb6b75e3682aca36e94ff3 sh -c "cat /etc/*release | grep ^ID"
##[error]Failed to restore cache entry. Exiting as fail-on-cache-miss is set. Input key: cloud-sql-proxy-binary-z4
##[group]Run false

Also, even though fail-on-cache-miss is set, it doesn't actually terminate the job; I need another step (false here) to fail it.
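For reference, v3 also ships a restore-only sub-action; a sketch of the consumer job using actions/cache/restore, assuming fail-on-cache-miss fails the step as its README documents (untested here):

```yaml
  cache-user:
    runs-on: ubuntu-latest
    needs: download-cloud-sql-proxy-binary
    container:
      image: ubuntu
    steps:
      # restore-only variant of actions/cache; with fail-on-cache-miss set,
      # a miss is documented to fail this step without needing an extra guard
      - uses: actions/cache/restore@v3
        with:
          path: cloud-sql-proxy
          key: cloud-sql-proxy-binary-z4
          fail-on-cache-miss: true
```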

v2 sample, which works:

jobs:
  download-cloud-sql-proxy-binary:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/cache@v2
        id: check-cached
        with:
          path: cloud-sql-proxy
          key: cloud-sql-proxy-binary-z5
      - name: download cloud-sql-proxy binary
        if: ${{ steps.check-cached.outputs.cache-hit != 'true' }}
        run: |
          curl -o cloud-sql-proxy \
          https://storage.googleapis.com/cloud-sql-connectors/cloud-sql-proxy/v2.8.1/cloud-sql-proxy.linux.amd64; \
          chmod +x cloud-sql-proxy
      - name: cache cloud-sql-proxy binary
        if: ${{ steps.check-cached.outputs.cache-hit != 'true' }}
        uses: actions/cache@v2
        with:
          path: cloud-sql-proxy
          key: cloud-sql-proxy-binary-z5

  cache-user:
    runs-on: ubuntu-latest
    needs: download-cloud-sql-proxy-binary
    container:
      image: ubuntu
    steps:
      - uses: actions/cache@v2
        id: restore-cloud-sql-proxy-cache
        with:
          path: cloud-sql-proxy
          key: cloud-sql-proxy-binary-z5
      - run: false
        if: ${{ steps.restore-cloud-sql-proxy-cache.outputs.cache-hit != 'true' }}

Logs:

Download action repository 'actions/cache@v2' (SHA:8492260343ad570701412c2f464a5877dc76bace)
Complete job name: download-cloud-sql-proxy-binary
##[group]Run actions/cache@v2
with:
  path: cloud-sql-proxy
  key: cloud-sql-proxy-binary-z5
##[endgroup]
Cache not found for input keys: cloud-sql-proxy-binary-z5
##[group]Run curl -o cloud-sql-proxy \
curl -o cloud-sql-proxy \
https://storage.googleapis.com/cloud-sql-connectors/cloud-sql-proxy/v2.8.1/cloud-sql-proxy.linux.amd64; \
chmod +x cloud-sql-proxy
shell: /usr/bin/bash -e {0}
##[endgroup]
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 29.7M  100 29.7M    0     0  61.0M      0 --:--:-- --:--:-- --:--:-- 61.1M
##[group]Run actions/cache@v2
with:
  path: cloud-sql-proxy
  key: cloud-sql-proxy-binary-z5
##[endgroup]
Cache not found for input keys: cloud-sql-proxy-binary-z5
Post job cleanup.
[command]/usr/bin/tar --posix -z -cf cache.tgz -P -C REDACTED --files-from manifest.txt
Cache Size: ~13 MB (14149872 B)
Cache saved successfully
Cache saved with key: cloud-sql-proxy-binary-z5
Post job cleanup.
Unable to reserve cache with key cloud-sql-proxy-binary-z5, another job may be creating this cache.
Cleaning up orphan processes
... second job ...
##[command]/usr/bin/docker inspect --format "{{range .Config.Env}}{{println .}}{{end}}" 19186edeef8448737d77626e677cf4455144202e47dd3d21d9e611a07cb62bcc
HOME=/github/home
GITHUB_ACTIONS=true
CI=true
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
##[endgroup]
##[group]Waiting for all services to be ready
##[endgroup]
##[group]Run actions/cache@v2
with:
  path: cloud-sql-proxy
  key: cloud-sql-proxy-binary-z5
##[endgroup]
##[command]/usr/bin/docker exec  19186edeef8448737d77626e677cf4455144202e47dd3d21d9e611a07cb62bcc sh -c "cat /etc/*release | grep ^ID"
Received 14149872 of 14149872 (100.0%), 67.5 MBs/sec
Cache Size: ~13 MB (14149872 B)
[command]/usr/bin/tar -z -xf /__w/_temp/a29e895e-8ae0-4fd0-a912-58dafc3b472d/cache.tgz -P -C REDACTED
Cache restored successfully
Cache restored from key: cloud-sql-proxy-binary-z5
Post job cleanup.
##[command]/usr/bin/docker exec  19186edeef8448737d77626e677cf4455144202e47dd3d21d9e611a07cb62bcc sh -c "cat /etc/*release | grep ^ID"
Cache hit occurred on the primary key cloud-sql-proxy-binary-z5, not saving cache.
Stop and remove container: 429cc935fa784d5b91e62561ae38fc9a_ubuntu_974c72
##[command]/usr/bin/docker rm --force 19186edeef8448737d77626e677cf4455144202e47dd3d21d9e611a07cb62bcc
19186edeef8448737d77626e677cf4455144202e47dd3d21d9e611a07cb62bcc
Remove container network: github_network_b42726fea1be4e09909a1c53a335e8f2
##[command]/usr/bin/docker network rm github_network_b42726fea1be4e09909a1c53a335e8f2
github_network_b42726fea1be4e09909a1c53a335e8f2
Cleaning up orphan processes
bitjson commented 8 months ago

Running into the same issue, though downgrading to v2 doesn't seem to resolve it in my repo.

This comment helped me realize that the resolved path is different between the two jobs, even though I use the same relative path in both configurations.

In my case, the non-containerized job uses a working directory of /home/runner/work/repo-name/repo-name, while the containerized job uses a working directory of /__w/repo-name/repo-name. (Maybe related: https://github.com/actions/runner/issues/2058)
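One quick way to confirm the difference is to print the resolved workspace in each job (assuming a POSIX shell is available in both):

```yaml
      - name: show resolved working directory
        run: |
          # On the VM runner this prints something like /home/runner/work/<repo>/<repo>;
          # inside a container job it prints /__w/<repo>/<repo>
          echo "workspace: $GITHUB_WORKSPACE"
          pwd
```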

It would be nice if actions/cache could use the relative path in selecting matches rather than the resolved absolute path.

For now, I'm working around the issue by just creating the duplicate caches, but I suppose you could also work around the issue by copying the cached files to a common absolute path and/or using symlinks.
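The common-absolute-path workaround could look roughly like this on the saving side (the /opt/shared-cache location is hypothetical; the reading job would use the same absolute path):

```yaml
      - name: stage files at a common absolute path
        run: |
          # Copy the files to a fixed location so the VM job and the
          # container job both see the same resolved absolute path
          sudo mkdir -p /opt/shared-cache
          cp -r cloud-sql-proxy /opt/shared-cache/
      - uses: actions/cache@v3
        with:
          path: /opt/shared-cache/cloud-sql-proxy
          key: cloud-sql-proxy-binary-z4
```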

Julio-Guerra commented 7 months ago

Just stumbled upon this problem with v4 too. Any idea what might be missing in the container for caching to work? The container image we use is golang.

ruffsl commented 5 months ago

Bumped into this issue yesterday, but luckily remembered reading a related ticket that detailed the nuances of cache restoration. TL;DR: Ensure sufficient/matching compression tools are available in both the container and non-container runtimes:
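In practice that means making sure zstd is installed inside the container before any cache step runs, since newer versions of actions/cache compress with zstd on the runner VM and fall back to gzip where zstd is missing, producing a different cache version. A sketch for the stock ubuntu image (untested; package name assumed to be zstd):

```yaml
  cache-user:
    runs-on: ubuntu-latest
    container:
      image: ubuntu
    steps:
      # Install zstd first so actions/cache selects the same compression
      # method (and therefore the same cache version) as the VM job that
      # saved the cache
      - name: install zstd
        run: apt-get update && apt-get install -y zstd
      - uses: actions/cache@v3
        with:
          path: cloud-sql-proxy
          key: cloud-sql-proxy-binary-z4
```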