falcondev-oss / github-actions-cache-server

Self-hosted GitHub Actions cache server implementation. Compatible with official 'actions/cache' action
https://gha-cache-server.falcondev.io
MIT License

Fails with docker action #41

Closed joh-klein closed 2 months ago

joh-klein commented 2 months ago

I deployed the cache server alongside my runners in Kubernetes. The npm jobs worked perfectly, both uploading and downloading. But when it came to the docker/build-push-action@v5 step, cache download and upload failed.

Here is the relevant step:

- name: Build and push
  uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: |
      tag1
      tag2
    cache-from: type=gha
    cache-to: type=gha,mode=max
    file: apps/${{ matrix.app }}/docker/Dockerfile
    provenance: false
    sbom: false

The error (downloading the cache):

#4 importing cache manifest from gha:6098406064130472668
#4 ERROR: Get "http://cache-server:3000/---token---/_apis/artifactcache/cache?keys=index-buildkit-1-f921bd05&version=693bb7016429d80366022f036f84856888c9f13e00145f5f6f4dce303a38d6f2": dial tcp: lookup cache-server on 46.38.225.xxx:53: no such host

The other error (uploading the cache):

ERROR: failed to solve: Get "http://cache-server:3000/---token---/_apis/artifactcache/cache?keys=buildkit-blob-1-sha256%3A067b0579ac4a7184ba4d6bd836719a8fa7286f4d20291d4f10a0b8c85ffca4c9&version=693bb7016429d80366022f036f84856888c9f13e00145f5f6f4dce303a38d6f2": dial tcp: lookup cache-server on 46.38.225.xxx:53: no such host

I have no idea where the IP address 46.38.225.xxx comes from. It's not one of mine (but it does belong to the hosting provider I am using).

I have three physical servers in my cluster (with 16 runner pods). In the other step, the runners were perfectly able to connect to the cache server and use it.

LouisHaftmann commented 2 months ago

Looks like the runner cannot connect to the cache server. Are you sure it works with other actions? There should be no difference between npm and Docker jobs (if they use the official actions/cache action).

LouisHaftmann commented 2 months ago

Try enabling debug logs by setting DEBUG=true on the cache server.
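For a cache server running in Kubernetes, as in this setup, the variable could be set in the container spec of its Deployment. This is only a sketch of that fragment: the container name is hypothetical, and the rest of the Deployment is omitted.

```yaml
# Fragment of the cache server Deployment (pod template, container spec only).
# "cache-server" as the container name is an assumption for illustration.
containers:
  - name: cache-server
    env:
      # Enables the debug logs mentioned above
      - name: DEBUG
        value: "true"   # quoted, because env values must be strings
```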

joh-klein commented 2 months ago

I have a feeling that the Docker action is not using the official action exactly, but something close to it, if I understand the docs correctly.

Logging shows it working perfectly:

[cache-server] ⚙ Get: Getting cache entry for [ 'npm-deps-01aee79ff25db681a8d8a57efc508e7c9fa80270155c8b6c1c46a5e191988xxx',
  'npm-deps-' ] 53079b9d382d510d013ca6baced3bd709bb6f2b337b4a15bc5c7899c5fb2dxxx
[cache-server] ⚙ Finding key match { key: 'npm-deps-01aee79ff25db681a8d8a57efc508e7c9fa80270155c8b6c1c46a5e191988xxx',
  version: '53079b9d382d510d013ca6baced3bd709bb6f2b337b4a15bc5c7899c5fb2dxxx',
  restoreKeys: [ 'npm-deps-' ] }
[cache-server] ⚙ Get: Cache entry found for [ 'npm-deps-01aee79ff25db681a8d8a57efc508e7c9fa80270155c8b6c1c46a5e191988xxx',
  'npm-deps-' ] 53079b9d382d510d013ca6baced3bd709bb6f2b337b4a15bc5c7899c5fb2dxxx with id npm-deps-01aee79ff25db681a8d8a57efc508e7c9fa80270155c8b6c1c46a5e191988xxx
[cache-server] ⚙ Download: Downloading npm-deps-01aee79ff25db681a8d8a57efc508e7c9fa80270155c8b6c1c46a5e191988xxx-53079b9d382d510d013ca6baced3bd709bb6f2b337b4a15bc5c7899c5fb2dxxx

LouisHaftmann commented 2 months ago

If there are no logs for the Docker cache, the request probably never reaches the cache server. The error "dial tcp: lookup cache-server on 46.38.225.xxx:53: no such host" also suggests that the networking in Kubernetes is misconfigured.

joh-klein commented 2 months ago

I am not a k8s expert, but my thinking was: if the lookup works for npm, it should work for everything. How can I make sure that it works correctly?

LouisHaftmann commented 2 months ago

Docker probably uses the wrong DNS server (46.38.225.xxx:53), so it cannot resolve the hostname cache-server to an IP address. Is 46.38.225.xxx one of your servers?

joh-klein commented 2 months ago

No, it is not one of the 3 servers in my k8s cluster. But it definitely belongs to the hosting provider.

LouisHaftmann commented 2 months ago

Docker might use a different DNS server instead of the Kubernetes DNS. Since this is probably a Kubernetes configuration issue, I'll close this issue for now. Feel free to reopen later!

joh-klein commented 2 months ago

I just ran nslookup cache-server in one of my containers, and that worked.

LouisHaftmann commented 2 months ago

The Docker build process probably uses a different DNS server.
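One way to test this from within the workflow is to compare name resolution in the runner container with the resolver configuration inside the BuildKit builder. The steps below are a rough sketch, not a verified recipe. They assume the default docker-container driver, that the setup-buildx-action step has been given the hypothetical id `buildx`, and that the builder container follows the usual `buildx_buildkit_<builder-name>0` naming.

```yaml
- name: Check DNS from the runner container
  run: nslookup cache-server

- name: Check DNS configuration inside the BuildKit builder
  run: |
    # Make sure the builder container is actually running
    docker buildx inspect --bootstrap
    # Assumption: container name is buildx_buildkit_<builder-name>0
    docker exec "buildx_buildkit_${{ steps.buildx.outputs.name }}0" cat /etc/resolv.conf
```

If the second step lists a nameserver other than the cluster DNS, the builder cannot resolve Kubernetes service names such as cache-server, which would match the error above.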

joh-klein commented 2 months ago

That is probably it! For the GitHub cache backend to work, the job needs https://github.com/docker/setup-buildx-action to run first, and that starts a dedicated Docker container.

joh-klein commented 2 months ago

Alright – I got it to work. Partially :) I have to set up Buildx like this (and add some permissions in k8s, etc.):

- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v3
  with:
    driver: kubernetes

But then I finally get the output from the cache server:

[cache-server] ⚙ No exact primary matches found
[cache-server] ⚙ No restore keys provided
[cache-server] ⚙ Get: No cache entry found for [ 'buildkit-blob-1-sha256:0d0c16747d2c6b6c26c064652afcb964c15f1b1e596ec052b2aa19b83948ae27' ] 693bb7016429d80366022f036f84856888c9f13e00145f5f6f4dce303a38d6f2
[cache-server] ⚙ Get: Getting cache entry for [ 'buildkit-blob-1-sha256:0d0c16747d2c6b6c26c064652afcb964c15f1b1e596ec052b2aa19b83948ae27' ] 693bb7016429d80366022f036f84856888c9f13e00145f5f6f4dce303a38d6f2
[cache-server] ⚙ Finding key match { key:
   'buildkit-blob-1-sha256:0d0c16747d2c6b6c26c064652afcb964c15f1b1e596ec052b2aa19b83948ae27',
  version: '693bb7016429d80366022f036f84856888c9f13e00145f5f6f4dce303a38d6f2',
  restoreKeys: undefined }

So the lookup works, but uploading seems to be a problem. That belongs in the next issue …
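The thread does not spell out the Kubernetes permissions mentioned alongside the `driver: kubernetes` setup. As a hedged sketch: with this driver, Buildx creates a Deployment for the builder and executes commands in its pods, so the runner's service account likely needs a Role along these lines. The names `buildx-builder`, `runners`, and `runner` are hypothetical, and the exact set of rules may differ between Buildx versions.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: buildx-builder        # hypothetical name
  namespace: runners          # hypothetical namespace of the runner pods
rules:
  # The kubernetes driver manages the builder as a Deployment
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  # It locates the builder pods ...
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
  # ... and talks to BuildKit by exec-ing into them
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["get", "create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: buildx-builder
  namespace: runners
subjects:
  - kind: ServiceAccount
    name: runner              # hypothetical service account used by the runner pods
    namespace: runners
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: buildx-builder
```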