docker / setup-buildx-action

GitHub Action to set up Docker Buildx
https://github.com/marketplace/actions/docker-setup-buildx
Apache License 2.0
984 stars 156 forks source link

setup-buildx-action hangs with 99% CPU when running on latest Github Action Runner v2.285.0 #117

Closed dresselm closed 2 years ago

dresselm commented 2 years ago

Behaviour

Our runners were auto-updated this morning from v2.284.0 to v2.285.0 and began to hang on the Download and Install Buildx step:

Screen Shot 2021-12-01 at 5 24 52 PM

Steps to reproduce this issue

  1. Upgrade your runner to version v2.285.0
  2. Run the docker/setup-buildx-action@v1 in your workflow
  3. You should see heavy CPU utilization that did not exist in v2.284.0 image

Expected behaviour

The step should complete in under 10s per our typical experience

Actual behaviour

The step is hanging indefinitely

Configuration

Note: all steps prior to the failing step pass as expected

name: Test Environment

on:
  pull_request:
    types: [labeled, reopened, synchronize, ready_for_review]
  workflow_dispatch:

env:
  HEAD_SHA: ${{ github.event.pull_request.head.sha }}

jobs:
  build:
    name: Build Image
    runs-on: self-hosted
    steps:
      - name: Set outputs
        id: vars
        run: |
          echo "::set-output name=head_sha::$(echo $PR_HEAD_SHA | cut -c1-7)"
        env:
          PR_HEAD_SHA: ${{ env.HEAD_SHA }}

      - name: Login to DockerHub
        uses: docker/login-action@v1
        with:
          username: ${{ secrets.DOCKER_HUB_USERNAME }}
          password: ${{ secrets.DOCKER_HUB_ACCESS_TOKEN }}

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: redacted
          role-to-assume: redacted

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1

      - name: Set up Docker Context for Buildx
        id: buildx-context
        run: |
          docker context create test-builder

      - name: Set up Docker Buildx
        id: buildx
        uses: docker/setup-buildx-action@v1
        with:
          endpoint: test-builder

      - name: Build and push Docker images
        uses: docker/build-push-action@v2.7.0

Logs

Download the log file of your build and attach it to this issue.

Our logs are littered with private, implementation details. Could you help me isolate which part of the logs you need for diagnosis?

2021-12-01T19:25:54.2902593Z Found online and idle self-hosted runner in the  current repository's enterprise account that matches the required labels: 'self-hosted'
2021-12-01T19:25:54.4048239Z Waiting for the runner to pick up this job...
2021-12-01T19:26:54.7363687Z Job was not picked up by the preferred runner, try requesting additional runners to run this job
2021-12-01T19:26:54.8282602Z Waiting for a self-hosted runner to pickup this job...
2021-12-01T19:32:40.3175224Z Job is about to start running on the runner: self-hosted-runner-group-6zjgq-67xpv (enterprise)
2021-12-01T19:32:44.8744015Z Current runner version: '2.285.0'
2021-12-01T19:32:44.8753670Z Runner name: 'self-hosted-runner-group-6zjgq-67xpv'
2021-12-01T19:32:44.8755541Z Runner group name: 'self-hosted'
2021-12-01T19:32:44.8758083Z Machine name: 'self-hosted-runner-group-6zjgq-67xpv'
...
2021-12-01T19:32:49.5211515Z ##[group]Download and install buildx
2021-12-01T19:32:49.5825561Z Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
2021-12-01T20:12:24.6445002Z ##[error]The operation was canceled.
2021-12-01T20:12:24.6564749Z Post job cleanup.
2021-12-01T20:12:24.7568406Z Post job cleanup.
2021-12-01T20:12:24.9055991Z Post job cleanup.
2021-12-01T20:12:24.9780442Z Post job cleanup.
2021-12-01T20:12:25.0473555Z [command]/usr/local/bin/docker logout 
2021-12-01T20:12:25.1428510Z Removing login credentials for https://index.docker.io/v1/
2021-12-01T20:12:25.1518891Z Cleaning up orphan processes
ghostsquad commented 2 years ago

I just ran into the same issue, as I attempt a new setup using private hosted runners. I thought it was related to running this in k8s, but this issue describes the exact same behavior. Downloading... and it just hangs there.

ghostsquad commented 2 years ago

I'd also like to confirm that when I switch to v2.284.0 this works.

apiVersion: actions.summerwind.dev/v1alpha1
kind: Runner
metadata:
  name: example-runner
  namespace: actions-runner-system
spec:
  repository: <redacted>
  image: summerwind/actions-runner:v2.284.0-ubuntu-20.04
  env: []
dresselm commented 2 years ago

@ghostsquad when you make this change (summerwind/actions-runner:v2.284.0-ubuntu-20.04) to the runner, it uses the older node version (v12.22.7) here instead of the updated node version (v12.13.1)? I haven't tried this yet, but will check it out tomorrow. I have been shelling into the runner and checking this: /runner/externals/node12/bin/node -v and it keeps using the new version and failing.

Another interesting thing I discovered is that this action runs runs using v12 but the latest GitHub runner (v2.285.0) shipped with v16 support as well.

crazy-max commented 2 years ago

@dresselm

Another interesting thing I discovered is that this action runs runs using v12 but the latest GitHub runner (v2.285.0) shipped with v16 support as well.

Interesting indeed, are you on GHE?

crazy-max commented 2 years ago

Our e2e workflow works fine with runner version: '2.285.0':

image

Are you using a self-hosted runner?

jnsvd commented 2 years ago

We're seeing the same error on self-hosted runners. Version auto-changed overnight and now the job is timing out on the uses: docker/setup-buildx-action@v1.5.0 step:

  WARNING: Plugin "/home/runner/.docker/cli-plugins/docker-buildx" is not valid: failed to fetch metadata: fork/exec /home/runner/.docker/cli-plugins/docker-buildx: permission denied
  WARNING: No swap limit support
Download and install buildx
  Error: The operation was canceled.
damyan commented 2 years ago

Yet another "same here". Self-hosted k8s runners on a private repo stuck forever at the buildx step.

rjhenry commented 2 years ago

Seeing the same here, first noted on Tuesday afternoon (UK time)

crazy-max commented 2 years ago

Can someone provide the workflow logs of an affected runner please (private info redacted ofc)? Thanks.

jnsvd commented 2 years ago

@crazy-max

Set-up job logs:

2021-12-02T13:24:49.5232557Z Current runner version: '2.285.0'
2021-12-02T13:24:49.5238376Z Runner name: 'shared-vjx25-5cz76'
2021-12-02T13:24:49.5239071Z Runner group name: 'Default'
2021-12-02T13:24:49.5240198Z Machine name: 'shared-vjx25-5cz76'
2021-12-02T13:24:49.5243337Z ##[group]GITHUB_TOKEN Permissions
2021-12-02T13:24:49.5244528Z Actions: write
2021-12-02T13:24:49.5245038Z Checks: write
2021-12-02T13:24:49.5245509Z Contents: write
2021-12-02T13:24:49.5246054Z Deployments: write
2021-12-02T13:24:49.5246576Z Discussions: write
2021-12-02T13:24:49.5247045Z Issues: write
2021-12-02T13:24:49.5247521Z Metadata: read
2021-12-02T13:24:49.5247996Z Packages: write
2021-12-02T13:24:49.5248538Z Pages: write
2021-12-02T13:24:49.5249020Z PullRequests: write
2021-12-02T13:24:49.5249610Z RepositoryProjects: write
2021-12-02T13:24:49.5250185Z SecurityEvents: write
2021-12-02T13:24:49.5250721Z Statuses: write
2021-12-02T13:24:49.5251310Z ##[endgroup]
2021-12-02T13:24:49.5253761Z Secret source: Actions
2021-12-02T13:24:49.5254650Z Prepare workflow directory
2021-12-02T13:24:49.6124677Z Prepare all required actions
2021-12-02T13:24:49.6133014Z Getting action download info
2021-12-02T13:24:49.9042420Z Download action repository 'actions/checkout@v2' (SHA:ec3a7ce113134d7a93b817d10a8272cb61118579)
2021-12-02T13:24:50.3898900Z Download action repository 'actions/setup-node@v2.2.0' (SHA:38d90ce44d5275ad62cc48384b3d8a58c500bb5f)
2021-12-02T13:24:50.9495641Z Download action repository 'actions/cache@v2' (SHA:937d24475381cd9c75ae6db12cb4e79714b926ed)
2021-12-02T13:24:51.4209320Z Download action repository 'azure/CLI@v1' (SHA:7378ce2ca3c38b4b063feb7a4cbe384fef978055)
2021-12-02T13:24:51.9278970Z Download action repository 'docker/login-action@v1.10.0' (SHA:f054a8b539a109f9f41c372932f1ae047eff08c9)
2021-12-02T13:24:52.2985638Z Download action repository 'docker/setup-buildx-action@v1.5.0' (SHA:e673438944759779e411a0f7ceef3ba437dccfa0)
2021-12-02T13:24:52.7750364Z Download action repository 'docker/build-push-action@v2.6.1' (SHA:1bc1040caef9e604eb543693ba89b5bf4fc80935)

docker/setup-buildx-action@v1.5.0 logs:

2021-12-02T13:26:53.7736865Z ##[group]Run docker/setup-buildx-action@v1.5.0
2021-12-02T13:26:53.7737409Z with:
2021-12-02T13:26:53.7737792Z   driver: docker
2021-12-02T13:26:53.7738728Z   buildkitd-flags: --allow-insecure-entitlement security.insecure --allow-insecure-entitlement network.host
2021-12-02T13:26:53.7739669Z   install: false
2021-12-02T13:26:53.7740122Z   use: true
2021-12-02T13:26:53.7740488Z env:
2021-12-02T13:26:53.7741443Z   EXPOSE_DOMAIN_NAME: <redacted>
2021-12-02T13:26:53.7742918Z   ACR_PREVIEW: <redacted>
2021-12-02T13:26:53.7743756Z   VERSION: pr-3-18
2021-12-02T13:26:53.7886312Z   TOKEN: ***
2021-12-02T13:26:53.7886738Z ##[endgroup]
2021-12-02T13:26:53.8429557Z ##[group]Docker info
2021-12-02T13:26:53.8458069Z [command]/usr/local/bin/docker version
2021-12-02T13:26:53.8912983Z Client:
2021-12-02T13:26:53.8915314Z  Version:           20.10.8
2021-12-02T13:26:53.8915801Z  API version:       1.41
2021-12-02T13:26:53.8918136Z  Go version:        go1.16.6
2021-12-02T13:26:53.8918888Z  Git commit:        3967b7d
2021-12-02T13:26:53.8919435Z  Built:             Fri Jul 30 19:50:40 2021
2021-12-02T13:26:53.8921369Z  OS/Arch:           linux/amd64
2021-12-02T13:26:53.8921845Z  Context:           default
2021-12-02T13:26:53.8922455Z  Experimental:      true
2021-12-02T13:26:53.8923729Z 
2021-12-02T13:26:53.8925459Z Server: Docker Engine - Community
2021-12-02T13:26:53.8926831Z  Engine:
2021-12-02T13:26:53.8927325Z   Version:          20.10.9
2021-12-02T13:26:53.8928714Z   API version:      1.41 (minimum version 1.12)
2021-12-02T13:26:53.8929542Z   Go version:       go1.16.8
2021-12-02T13:26:53.8930283Z   Git commit:       79ea9d3
2021-12-02T13:26:53.8930946Z   Built:            Mon Oct  4 16:07:30 2021
2021-12-02T13:26:53.8931662Z   OS/Arch:          linux/amd64
2021-12-02T13:26:53.8932425Z   Experimental:     false
2021-12-02T13:26:53.8933640Z  containerd:
2021-12-02T13:26:53.8934337Z   Version:          v1.4.11
2021-12-02T13:26:53.8935253Z   GitCommit:        5b46e404f6b9f661a205e28d59c982d3634148f8
2021-12-02T13:26:53.8936558Z  runc:
2021-12-02T13:26:53.8937150Z   Version:          1.0.2
2021-12-02T13:26:53.8938645Z   GitCommit:        v1.0.2-0-g52b36a2d
2021-12-02T13:26:53.8939681Z  docker-init:
2021-12-02T13:26:53.8940315Z   Version:          0.19.0
2021-12-02T13:26:53.8941084Z   GitCommit:        de40ad0
2021-12-02T13:26:53.8952687Z [command]/usr/local/bin/docker info
2021-12-02T13:26:53.9405279Z Client:
2021-12-02T13:26:53.9406048Z  Context:    default
2021-12-02T13:26:53.9406650Z  Debug Mode: false
2021-12-02T13:26:53.9407019Z 
2021-12-02T13:26:53.9407499Z Server:
2021-12-02T13:26:53.9407932Z  Containers: 0
2021-12-02T13:26:53.9408407Z   Running: 0
2021-12-02T13:26:53.9408826Z   Paused: 0
2021-12-02T13:26:53.9409270Z   Stopped: 0
2021-12-02T13:26:53.9409669Z  Images: 1
2021-12-02T13:26:53.9410099Z  Server Version: 20.10.9
2021-12-02T13:26:53.9410593Z  Storage Driver: overlay2
2021-12-02T13:26:53.9411109Z   Backing Filesystem: extfs
2021-12-02T13:26:53.9411612Z   Supports d_type: true
2021-12-02T13:26:53.9412103Z   Native Overlay Diff: false
2021-12-02T13:26:53.9412589Z   userxattr: false
2021-12-02T13:26:53.9413999Z  Logging Driver: json-file
2021-12-02T13:26:53.9414541Z  Cgroup Driver: cgroupfs
2021-12-02T13:26:53.9415026Z  Cgroup Version: 1
2021-12-02T13:26:53.9415474Z  Plugins:
2021-12-02T13:26:53.9415882Z   Volume: local
2021-12-02T13:26:53.9416428Z   Network: bridge host ipvlan macvlan null overlay
2021-12-02T13:26:53.9417424Z   Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
2021-12-02T13:26:53.9418092Z  Swarm: inactive
2021-12-02T13:26:53.9418942Z  Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
2021-12-02T13:26:53.9419836Z  Default Runtime: runc
2021-12-02T13:26:53.9420492Z  Init Binary: docker-init
2021-12-02T13:26:53.9421142Z  containerd version: 5b46e404f6b9f661a205e28d59c982d3634148f8
2021-12-02T13:26:53.9421911Z  runc version: v1.0.2-0-g52b36a2d
2021-12-02T13:26:53.9422417Z  init version: de40ad0
2021-12-02T13:26:53.9423218Z  Security Options:
2021-12-02T13:26:53.9423614Z   apparmor
2021-12-02T13:26:53.9423999Z   seccomp
2021-12-02T13:26:53.9424432Z    Profile: default
2021-12-02T13:26:53.9425035Z  Kernel Version: 5.4.0-1063-azure
2021-12-02T13:26:53.9425667Z  Operating System: Alpine Linux v3.14 (containerized)
2021-12-02T13:26:53.9426213Z  OSType: linux
2021-12-02T13:26:53.9426628Z  Architecture: x86_64
2021-12-02T13:26:53.9427054Z  CPUs: 8
2021-12-02T13:26:53.9427474Z  Total Memory: 31.36GiB
2021-12-02T13:26:53.9428268Z  Name: shared-vjx25-5cz76
2021-12-02T13:26:53.9428857Z  ID: MKF2:72OS:YKOV:2E5D:635C:TQSY:YW63:FN5L:JMGK:HTZQ:X6CY:C4XG
2021-12-02T13:26:53.9429427Z  Docker Root Dir: /var/lib/docker
2021-12-02T13:26:53.9429926Z  Debug Mode: false
2021-12-02T13:26:53.9430539Z  Registry: https://index.docker.io/v1/
2021-12-02T13:26:53.9431080Z  Labels:
2021-12-02T13:26:53.9431509Z  Experimental: false
2021-12-02T13:26:53.9432013Z  Insecure Registries:
2021-12-02T13:26:53.9432402Z   127.0.0.0/8
2021-12-02T13:26:53.9432838Z  Live Restore Enabled: false
2021-12-02T13:26:53.9433380Z  Product License: Community Engine
2021-12-02T13:26:53.9433715Z 
2021-12-02T13:26:53.9434164Z WARNING: No swap limit support
2021-12-02T13:26:53.9435253Z ##[endgroup]
2021-12-02T13:26:53.9851668Z ##[group]Download and install buildx
2021-12-02T13:26:54.1115979Z Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
2021-12-02T13:38:51.7240440Z ##[error]The operation was canceled.
crazy-max commented 2 years ago

@jnsvd I'm really confused. Can you add this step before the setup-buildx-action step to see if it's linked to the actions/tool-cache pkg that downloads buildx?:

      - name: Install Docker Buildx
        run: |
          mkdir -p "$HOME/.docker/cli-plugins"
          curl -SsL "https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64" -o "$HOME/.docker/cli-plugins/docker-buildx"
          chmod +x "$HOME/.docker/cli-plugins/docker-buildx"

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1
jnsvd commented 2 years ago

@crazy-max If we add the extra Install Docker Buildx step the pipeline finished successfully.

jnsvd commented 2 years ago

@crazy-max Step actions/cache@v2 doesn't seem to be related to the problem, when removing it (and your additional step), the pipeline is still freezing on the docker/setup-buildx-action step. When applying your step and the actions/cache@v2 the pipeline ran to completion.

richardpeng commented 2 years ago

I can confirm that the Install Docker Buildx step above does allow the setup-buildx-action to run successfully.

crazy-max commented 2 years ago

Ok thanks for your feedback! One last thing, can you add the ACTIONS_RUNNER_DEBUG secret to true and give me the output? It could tell us if this is an issue with the actions/tool-cache.

@jnsvd

Step actions/cache@v2 doesn't seem to be related to the problem

This is not about the actions/cache action but the node module actions/tool-cache we use in the setup-buildx-action: https://github.com/actions/toolkit/tree/main/packages/tool-cache.

richardpeng commented 2 years ago

Let me know if you need more logs than this, but you can see that the fixed version goes to Creating a new builder instance instead of getting stuck on Download and install buildx

CleanShot 2021-12-02 at 11 07 21@2x CleanShot 2021-12-02 at 11 07 12@2x

The operation was canceled. is misleading. The action never times out. That message appears when you cancel the workflow.

jnsvd commented 2 years ago

Ok thanks for your feedback! One last thing, can you add the ACTIONS_RUNNER_DEBUG secret to true and give me the output? It could tell us if this is an issue with the actions/tool-cache.

There's no more output when enabling diagnostic logging. The pipeline really seems to get stuck on Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64 somehow.

richardpeng commented 2 years ago

Would it be possible to add more logging for each of the steps? That might help future debugging

crazy-max commented 2 years ago

@jnsvd

There's no more output when enabling diagnostic logging. The pipeline really seems to get stuck on Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64 somehow.

Oh my bad it's ACTIONS_STEP_DEBUG to true and you should have smth like this:

image

crazy-max commented 2 years ago

Friendly ping @thboop @luketomlinson. Are you aware of some issues with the actions/tool-cache module on self-hosted runners updated to the latest v2.285.0?

We use the downloadTool and cacheFile functions here from actions/tool-cache and it seems to be frozen while caching. Any idea?

dresselm commented 2 years ago

@crazy-max @richardpeng

Regarding adding more debug logging, I was stuck yesterday even though I had ACTIONS_STEP_DEBUG=true setup correctly. Primary reason was that I saw the last line Plugin path... but wasn't seeing Fixing perms and there was no additional debug details describing what was happening while it was stuck.

debug-logs

richardpeng commented 2 years ago

Github is slowly rolling out updates to their runners so it took many tries to get a run on an upgraded self-hosted runner....

Here are the logs:

CleanShot 2021-12-02 at 13 05 35@2x

There are no further logs. It's just stuck at that spot

Update:

CleanShot 2021-12-02 at 13 18 10@2x
crazy-max commented 2 years ago

It looks like this line is the culprit:

https://github.com/docker/setup-buildx-action/blob/79abd3f86f79a9d68a23c75a09a9a85889262adf/src/buildx.ts#L200

I wonder if this is not a bug with fs.copyFileSync in the node version 12.22.7 added in v2.285.0 of the runner as shown in https://github.com/nodejs/node/issues/40200

No specific reports on your side @TingluoHuang @thboop?

TingluoHuang commented 2 years ago

@crazy-max This is the first report we got since we just bump the node12 packaged with the runner (v2.285.0) to that latest v12 LTS version. The issue you linked is for only node v16, isn't it?

crazy-max commented 2 years ago

@TingluoHuang

@crazy-max This is the first report we got since we just bump the node12 packaged with the runner (v2.285.0) to that latest v12 LTS version. The issue you linked is for only node v16, isn't it?

Yes indeed I try to find the root cause maybe it has been backported to 12.22.7 I'm not quite sure but anyway I tried again on a fresh runner 2.285.0 and can't repro (Ubuntu 20.04):

image

@richardpeng @dresselm @jnsvd @rjhenry @ghostsquad and others with this issue, can you show your Set up job logs like my screenshot? I wonder if the virtual env image should not been updated on your side. And also is this happening only on self-hosted runners for you all?

TingluoHuang commented 2 years ago

I think it might have something to do with the docker container volume mount. I am playing with it right now.

crazy-max commented 2 years ago

@TingluoHuang Thanks!

ghostsquad commented 2 years ago
Current runner version: '2.285.0'
Runner name: 'example-runner'
Runner group name: 'Default'
Machine name: 'example-runner'
GITHUB_TOKEN Permissions
Secret source: Actions
Prepare workflow directory
Prepare all required actions
Getting action download info
Download action repository 'actions/checkout@v2' (SHA:ec3a7ce113134d7a93b817d10a8272cb61118579)
Download action repository 'asdf-vm/actions@v1' (SHA:707e84f3ee349548310aeabdad0dd3bfcb9b69fa)
Download action repository 'docker/setup-buildx-action@v1' (SHA:94ab11c41e45d028884a99163086648e898eed25)
Download action repository 'actions/setup-go@v2' (SHA:331ce1d993939866bb63c32c6cbbfd48fa76fc57)
Download action repository 'actions/cache@v2' (SHA:937d24475381cd9c75ae6db12cb4e79714b926ed)

I'm not sure how useful this actually, is, since this was a private runner, it was using this image:

summerwind/actions-runner:v2.285.0-ubuntu-20.04
TingluoHuang commented 2 years ago

Can anyone with repro environment set environment NODE_DEBUG=fs when run the actions and share the log?

TingluoHuang commented 2 years ago

I can't repro with just docker + summerwind/actions-runner:v2.285.0-ubuntu-20.04

runner@fv-az43-260:~/work/ting-test/ting-test$ docker run -it --rm summerwind/actions-runner:v2.285.0-ubuntu-20.04 bash
Unable to find image 'summerwind/actions-runner:v2.285.0-ubuntu-20.04' locally
v2.285.0-ubuntu-20.04: Pulling from summerwind/actions-runner
7b1a6ab2e44d: Already exists 
8dccfeffb9d5: Pull complete 
677f2ea3860b: Pull complete 
a7e2a0869a62: Pull complete 
3a85d00f8ff6: Pull complete 
578b8578fd95: Pull complete 
c4b1e29c1c08: Pull complete 
ae9503151e38: Pull complete 
Digest: sha256:d08b7acb0f31c37ee50383a8a37164bb7401e1f88a7e7cff879934bf2f8c675a
Status: Downloaded newer image for summerwind/actions-runner:v2.285.0-ubuntu-20.04
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

runner@da451f87281b:/$ ls 
bin  boot  dev  entrypoint.sh  etc  home  lib  lib32  lib64  libx32  media  mnt  opt  proc  root  run  runnertmp  sbin  srv  sys  tmp  usr  var
runner@da451f87281b:/$ cd runnertmp/
runner@da451f87281b:/runnertmp$ ls
bin  config.sh  env.sh  externalstmp  patched  run.sh
runner@da451f87281b:/runnertmp$ curl -L -O https://github.com/TingluoHuang/setup-buildx-action/archive/refs/heads/master.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   151  100   151    0     0   1935      0 --:--:-- --:--:-- --:--:--  1935
100  244k    0  244k    0     0  1111k      0 --:--:-- --:--:-- --:--:-- 1111k
runner@da451f87281b:/runnertmp$ sudo tar xzf ./master.tar.gz 
runner@da451f87281b:/runnertmp$ ls
bin  config.sh  env.sh  externalstmp  master.tar.gz  patched  run.sh  setup-buildx-action-master
runner@da451f87281b:/runnertmp$ cd setup-buildx-action-master/
runner@da451f87281b:/runnertmp/setup-buildx-action-master$ ls
LICENSE  README.md  __tests__  action.yml  codecov.yml  dist  docker-bake.hcl  hack  jest.config.js  package-lock.json  package.json  src  tsconfig.json  yarn.lock   
runner@da451f87281b:/runnertmp$ mkdir tmp
runner@da451f87281b:/runnertmp$ export RUNNER_TEMP=/run
run/       runnertmp/ 
runner@da451f87281b:/runnertmp$ export RUNNER_TEMP=/runnertmp/tmp/
runner@da451f87281b:/runnertmp$ ./externalstmp/node12/bin/node ./setup-buildx-action-master/dist/index.js 
::group::Download and install buildx
::debug::Release v0.7.1 found
::debug::isExplicit: 0.7.1
::debug::explicit? true
::debug::checking cache: /opt/hostedtoolcache/buildx/0.7.1/x64
::debug::not found
Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
::debug::Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
::debug::Destination /runnertmp/tmp/014dd3b9-91d0-4d90-9824-9b860735f619
::debug::download complete
::debug::Downloaded to /runnertmp/tmp/014dd3b9-91d0-4d90-9824-9b860735f619
::debug::Caching tool buildx 0.7.1 x64
::debug::source file: /runnertmp/tmp/014dd3b9-91d0-4d90-9824-9b860735f619
::debug::destination /opt/hostedtoolcache/buildx/0.7.1/x64
::debug::destination file /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx
::debug::finished caching tool
::debug::Plugins dir is /home/runner/.docker/cli-plugins
::debug::Plugin path is /home/runner/.docker/cli-plugins/docker-buildx
Fixing perms
::endgroup::
runner@da451f87281b:/runnertmp$ 

Might have something to do with Kubernetes, trying it out...

TingluoHuang commented 2 years ago

Also can't repro with minikube on my ubuntu VM...

runner@fv-az43-260:~/work/ting-test/ting-test$ cat <<EOF | kubectl apply -f -
> apiVersion: v1
> kind: Pod
> metadata:
>   name: command-demo
> spec:
>   containers:
>   - name: command-demo-container
>     image: summerwind/actions-runner:v2.285.0-ubuntu-20.04
>     command: ["sleep"]
>     args: ["100000"]
> EOF
pod/command-demo created
runner@fv-az43-260:~/work/ting-test/ting-test$ kubectl exec -it command-demo bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

runner@command-demo:/$ ls
bin  boot  dev  entrypoint.sh  etc  home  lib  lib32  lib64  libx32  media  mnt  opt  proc  root  run  runnertmp  sbin  srv  sys  tmp  usr  var
runner@command-demo:/$ cd runnertmp/
runner@command-demo:/runnertmp$ curl -L -O https://github.com/TingluoHuang/setup-buildx-action/archive/refs/heads/master.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   151  100   151    0     0   1451      0 --:--:-- --:--:-- --:--:--  1451
100  244k    0  244k    0     0  1010k      0 --:--:-- --:--:-- --:--:-- 1010k
runner@command-demo:/runnertmp$ tar xcf ./master.tar.gz 
tar: You may not specify more than one '-Acdtrux', '--delete' or  '--test-label' option
Try 'tar --help' or 'tar --usage' for more information.
runner@command-demo:/runnertmp$ tar xzf ./master.tar.gz                                                                                                                                                                                        
runner@command-demo:/runnertmp$ mkdir tmp
runner@command-demo:/runnertmp$ export RUNNER_TEMP=/runnertmp/tmp 
runner@command-demo:/runnertmp$ ./externalstmp/node12/bin/node ./setup-buildx-action-master/dist/index.js 
::group::Download and install buildx
::debug::Release v0.7.1 found
::debug::isExplicit: 0.7.1
::debug::explicit? true
::debug::checking cache: /opt/hostedtoolcache/buildx/0.7.1/x64
::debug::not found
Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
::debug::Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
::debug::Destination /runnertmp/tmp/7eb2f149-3be0-427d-b3e4-42898e526a31
::debug::download complete
::debug::Downloaded to /runnertmp/tmp/7eb2f149-3be0-427d-b3e4-42898e526a31
::debug::Caching tool buildx 0.7.1 x64
::debug::source file: /runnertmp/tmp/7eb2f149-3be0-427d-b3e4-42898e526a31
::debug::destination /opt/hostedtoolcache/buildx/0.7.1/x64
::debug::destination file /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx
::debug::finished caching tool
::debug::Plugins dir is /home/runner/.docker/cli-plugins
::debug::Plugin path is /home/runner/.docker/cli-plugins/docker-buildx
Fixing perms
::endgroup::
runner@command-demo:/runnertmp$ 

Can anyone with repro environment try what I have run and see if they get a repro?

BTW, I forked the repro and remove all other code and only keep the part of downloading and filecopy in https://github.com/TingluoHuang/setup-buildx-action

damyan commented 2 years ago

.Can anyone with repro environment try what I have run and see if they get a repro?

Could not reproduce it your way on a failing self-hosted runner (v2.285.0-ubuntu-20.04)

jnsvd commented 2 years ago

@richardpeng @dresselm @jnsvd @rjhenry @ghostsquad and others with this issue, can you show your Set up job logs like my screenshot? I wonder if the virtual env image should not been updated on your side. And also is this happening only on self-hosted runners for you all?

@crazy-max I already shared my Set up job logs earlier, note that it doesn't provide any information on a virtual environment. Probably not something shown for self-hosted runners? There's not much info in that section besides the runner's version.

Can anyone with repro environment set environment NODE_DEBUG=fs when run the actions and share the log?

@TingluoHuang I'm getting the same output that @richardpeng is seeing:

name: Setup Docker buildx
uses: docker/setup-buildx-action@v1.6.0
with:
  driver: docker
env:
  NODE_DEBUG: fs
2021-12-03T09:02:35.4142614Z ##[debug]Evaluating condition for step: 'Setup Docker buildx'
2021-12-03T09:02:35.4144259Z ##[debug]Evaluating: success()
2021-12-03T09:02:35.4145026Z ##[debug]Evaluating success:
2021-12-03T09:02:35.4145772Z ##[debug]=> true
2021-12-03T09:02:35.4146536Z ##[debug]Result: true
2021-12-03T09:02:35.4147438Z ##[debug]Starting: Setup Docker buildx
2021-12-03T09:02:35.4167907Z ##[debug]Register post job cleanup for action: docker/setup-buildx-action@v1.6.0
2021-12-03T09:02:35.4181341Z ##[debug]Loading inputs
2021-12-03T09:02:35.4186780Z ##[debug]Loading env
2021-12-03T09:02:35.4193642Z ##[group]Run docker/setup-buildx-action@v1.6.0
2021-12-03T09:02:35.4194118Z with:
2021-12-03T09:02:35.4194462Z   driver: docker
2021-12-03T09:02:35.4195352Z   buildkitd-flags: --allow-insecure-entitlement security.insecure --allow-insecure-entitlement network.host
2021-12-03T09:02:35.4196216Z   install: false
2021-12-03T09:02:35.4196660Z   use: true
2021-12-03T09:02:35.4197197Z env:
2021-12-03T09:02:35.4198282Z   EXPOSE_DOMAIN_NAME: <redacted>
2021-12-03T09:02:35.4200801Z   ACR_PREVIEW: <redacted>
2021-12-03T09:02:35.4201593Z   VERSION: pr-4-26
2021-12-03T09:02:35.4310032Z   TOKEN: ***
2021-12-03T09:02:35.4310414Z   NODE_DEBUG: fs
2021-12-03T09:02:35.4310780Z ##[endgroup]
2021-12-03T09:02:35.4866965Z ::save-state name=isPost::true
2021-12-03T09:02:35.4867592Z ##[debug]Save intra-action state isPost = true
2021-12-03T09:02:35.4871957Z ::group::Docker info
2021-12-03T09:02:35.4872395Z ##[group]Docker info
2021-12-03T09:02:35.4901885Z [command]/usr/local/bin/docker version
2021-12-03T09:02:35.5365418Z Client:
2021-12-03T09:02:35.5368133Z  Version:           20.10.8
2021-12-03T09:02:35.5368703Z  API version:       1.41
2021-12-03T09:02:35.5369258Z  Go version:        go1.16.6
2021-12-03T09:02:35.5369795Z  Git commit:        3967b7d
2021-12-03T09:02:35.5370370Z  Built:             Fri Jul 30 19:50:40 2021
2021-12-03T09:02:35.5370928Z  OS/Arch:           linux/amd64
2021-12-03T09:02:35.5371470Z  Context:           default
2021-12-03T09:02:35.5372035Z  Experimental:      true
2021-12-03T09:02:35.5372568Z 
2021-12-03T09:02:35.5374074Z Server: Docker Engine - Community
2021-12-03T09:02:35.5374541Z  Engine:
2021-12-03T09:02:35.5374922Z   Version:          20.10.9
2021-12-03T09:02:35.5375395Z   API version:      1.41 (minimum version 1.12)
2021-12-03T09:02:35.5375877Z   Go version:       go1.16.8
2021-12-03T09:02:35.5376301Z   Git commit:       79ea9d3
2021-12-03T09:02:35.5376743Z   Built:            Mon Oct  4 16:07:30 2021
2021-12-03T09:02:35.5377209Z   OS/Arch:          linux/amd64
2021-12-03T09:02:35.5377848Z   Experimental:     false
2021-12-03T09:02:35.5378282Z  containerd:
2021-12-03T09:02:35.5378685Z   Version:          v1.4.11
2021-12-03T09:02:35.5379277Z   GitCommit:        5b46e404f6b9f661a205e28d59c982d3634148f8
2021-12-03T09:02:35.5379791Z  runc:
2021-12-03T09:02:35.5380419Z   Version:          1.0.2
2021-12-03T09:02:35.5381075Z   GitCommit:        v1.0.2-0-g52b36a2d
2021-12-03T09:02:35.5381619Z  docker-init:
2021-12-03T09:02:35.5382023Z   Version:          0.19.0
2021-12-03T09:02:35.5382430Z   GitCommit:        de40ad0
2021-12-03T09:02:35.5399999Z [command]/usr/local/bin/docker info
2021-12-03T09:02:35.5851091Z Client:
2021-12-03T09:02:35.5851815Z  Context:    default
2021-12-03T09:02:35.5852354Z  Debug Mode: false
2021-12-03T09:02:35.5852698Z 
2021-12-03T09:02:35.5853153Z Server:
2021-12-03T09:02:35.5853983Z  Containers: 0
2021-12-03T09:02:35.5854485Z   Running: 0
2021-12-03T09:02:35.5854857Z   Paused: 0
2021-12-03T09:02:35.5855223Z   Stopped: 0
2021-12-03T09:02:35.5855569Z  Images: 1
2021-12-03T09:02:35.5855964Z  Server Version: 20.10.9
2021-12-03T09:02:35.5856412Z  Storage Driver: overlay2
2021-12-03T09:02:35.5856894Z   Backing Filesystem: extfs
2021-12-03T09:02:35.5857352Z   Supports d_type: true
2021-12-03T09:02:35.5857808Z   Native Overlay Diff: false
2021-12-03T09:02:35.5858272Z   userxattr: false
2021-12-03T09:02:35.5859299Z  Logging Driver: json-file
2021-12-03T09:02:35.5859984Z  Cgroup Driver: cgroupfs
2021-12-03T09:02:35.5860411Z  Cgroup Version: 1
2021-12-03T09:02:35.5860779Z  Plugins:
2021-12-03T09:02:35.5861231Z   Volume: local
2021-12-03T09:02:35.5861737Z   Network: bridge host ipvlan macvlan null overlay
2021-12-03T09:02:35.5862699Z   Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
2021-12-03T09:02:35.5863338Z  Swarm: inactive
2021-12-03T09:02:35.5864095Z  Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
2021-12-03T09:02:35.5864825Z  Default Runtime: runc
2021-12-03T09:02:35.5865383Z  Init Binary: docker-init
2021-12-03T09:02:35.5866016Z  containerd version: 5b46e404f6b9f661a205e28d59c982d3634148f8
2021-12-03T09:02:35.5866871Z  runc version: v1.0.2-0-g52b36a2d
2021-12-03T09:02:35.5867419Z  init version: de40ad0
2021-12-03T09:02:35.5867901Z  Security Options:
2021-12-03T09:02:35.5868297Z   apparmor
2021-12-03T09:02:35.5868665Z   seccomp
2021-12-03T09:02:35.5869162Z    Profile: default
2021-12-03T09:02:35.5869844Z  Kernel Version: 5.4.0-1063-azure
2021-12-03T09:02:35.5870616Z  Operating System: Alpine Linux v3.14 (containerized)
2021-12-03T09:02:35.5871356Z  OSType: linux
2021-12-03T09:02:35.5874152Z  Architecture: x86_64
2021-12-03T09:02:35.5874682Z  CPUs: 8
2021-12-03T09:02:35.5875191Z  Total Memory: 31.36GiB
2021-12-03T09:02:35.5876055Z  Name: shared-6424v-hgmpk
2021-12-03T09:02:35.5876833Z  ID: 2EHW:36OS:CJ6O:HM7O:K6BZ:5NP2:IBZD:OU4L:5526:ZJZH:J6PK:P445
2021-12-03T09:02:35.5877541Z  Docker Root Dir: /var/lib/docker
2021-12-03T09:02:35.5878130Z  Debug Mode: false
2021-12-03T09:02:35.5878874Z  Registry: https://index.docker.io/v1/
2021-12-03T09:02:35.5879563Z  Labels:
2021-12-03T09:02:35.5880106Z  Experimental: false
2021-12-03T09:02:35.5880717Z  Insecure Registries:
2021-12-03T09:02:35.5881227Z   127.0.0.0/8
2021-12-03T09:02:35.5881817Z  Live Restore Enabled: false
2021-12-03T09:02:35.5882532Z  Product License: Community Engine
2021-12-03T09:02:35.5882984Z 
2021-12-03T09:02:35.5883597Z WARNING: No swap limit support
2021-12-03T09:02:35.5884988Z ::endgroup::
2021-12-03T09:02:35.5885460Z ##[endgroup]
2021-12-03T09:02:35.6560952Z ::group::Download and install buildx
2021-12-03T09:02:35.6561563Z ##[group]Download and install buildx
2021-12-03T09:02:35.9540593Z ##[debug]Release v0.7.1 found
2021-12-03T09:02:35.9546050Z ##[debug]isExplicit: 0.7.1
2021-12-03T09:02:35.9547154Z ##[debug]explicit? true
2021-12-03T09:02:35.9548339Z ##[debug]checking cache: /opt/hostedtoolcache/buildx/0.7.1/x64
2021-12-03T09:02:35.9549650Z ##[debug]not found
2021-12-03T09:02:35.9556881Z Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
2021-12-03T09:02:35.9565690Z ##[debug]Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
2021-12-03T09:02:35.9567565Z ##[debug]Destination /runner/_work/_temp/965ebd31-5089-455e-a609-c6ce7fcfd06f
2021-12-03T09:02:38.0875929Z ##[debug]download complete
2021-12-03T09:02:38.0877910Z ##[debug]Downloaded to /runner/_work/_temp/965ebd31-5089-455e-a609-c6ce7fcfd06f
2021-12-03T09:02:38.0878983Z ##[debug]Caching tool buildx 0.7.1 x64
2021-12-03T09:02:38.0880252Z ##[debug]source file: /runner/_work/_temp/965ebd31-5089-455e-a609-c6ce7fcfd06f
2021-12-03T09:02:38.0881406Z ##[debug]destination /opt/hostedtoolcache/buildx/0.7.1/x64
2021-12-03T09:02:38.0888387Z ##[debug]destination file /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx
2021-12-03T09:02:38.1196557Z ##[debug]finished caching tool
2021-12-03T09:02:38.1198430Z ##[debug]Plugins dir is /home/runner/.docker/cli-plugins
2021-12-03T09:02:38.1199742Z ##[debug]Plugin path is /home/runner/.docker/cli-plugins/docker-buildx
Meroje commented 2 years ago

Seeing the same thing here with summewrinds controller on EKS, I was somehow able to unstuck it by removing /opt/hostedtoolcache/buildx/ on a runner showing the "Downloading" line.

TingluoHuang commented 2 years ago

Looks like this is an issue since node v12.19.0 image

v12.18.4 works fine... image

base on https://nodejs.org/uk/blog/release/v12.19.0/ the highly suspicious change is deps: upgrade to libuv 1.39.0 (Colin Ihrig) #34915

which lines up with people discussed in https://github.com/nodejs/node/issues/40200

crazy-max commented 2 years ago

I opened a PR to switch to node 16 #118, can someone try with crazy-max/docker-setup-buildx-action@node16?

crazy-max commented 2 years ago

@TingluoHuang Oh just in time thanks a lot for your feedback!

which lines up with people discussed in nodejs/node#40200

Yeah that was my first assumption in https://github.com/docker/setup-buildx-action/issues/117#issuecomment-984896179

TingluoHuang commented 2 years ago

node16 doesn't work, it hangs as well. 😢

image

TingluoHuang commented 2 years ago

it's related to the double copy... 😵‍💫

Fri, 03 Dec 2021 21:53:15 GMT
  export NODE_DEBUG=fs
Fri, 03 Dec 2021 21:53:15 GMT
  whoami      
Fri, 03 Dec 2021 21:53:15 GMT
  echo "try download and copy with curl and cp"
Fri, 03 Dec 2021 21:53:15 GMT
  curl -O -L https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
Fri, 03 Dec 2021 21:53:15 GMT
  mkdir -p /opt/hostedtoolcache/buildx/0.7.1/x64
Fri, 03 Dec 2021 21:53:15 GMT
  cp -v ./buildx-v0.7.1.linux-amd64 /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx
Fri, 03 Dec 2021 21:53:15 GMT
  mkdir -p /home/runner/.docker/cli-plugins
Fri, 03 Dec 2021 21:53:15 GMT
  cp -v /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx /home/runner/.docker/cli-plugins/docker-buildx
Fri, 03 Dec 2021 21:53:15 GMT
  ls -l /home/runner/.docker/cli-plugins/docker-buildx
Fri, 03 Dec 2021 21:53:15 GMT

Fri, 03 Dec 2021 21:53:15 GMT
  node -v
Fri, 03 Dec 2021 21:53:15 GMT
  echo "try use node to copy from /opt/hostedtoolcache to /home/runner/.docker/cli-plugins"
Fri, 03 Dec 2021 21:53:15 GMT
  node -e "fs=require('fs'); fs.copyFileSync('/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx', '/home/runner/.docker/cli-plugins/docker-buildx2'); "
Fri, 03 Dec 2021 21:53:15 GMT
  #node ./setup-buildx-action-master/dist/index.js 
Fri, 03 Dec 2021 21:53:15 GMT
  ls -l /home/runner/.docker/cli-plugins/docker-buildx2
Fri, 03 Dec 2021 21:53:15 GMT

Fri, 03 Dec 2021 21:53:15 GMT
  echo "try use node to copy from ./buildx-v0.7.1.linux-amd64 to /opt/hostedtoolcache first and then use cp to copy to /home/runner/.docker/cli-plugins"
Fri, 03 Dec 2021 21:53:15 GMT
  node -e "fs=require('fs'); fs.copyFileSync('./buildx-v0.7.1.linux-amd64', '/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx3'); "
Fri, 03 Dec 2021 21:53:15 GMT
  ls -l /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx3
Fri, 03 Dec 2021 21:53:15 GMT
  cp -v /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx3 /home/runner/.docker/cli-plugins/docker-buildx4
Fri, 03 Dec 2021 21:53:15 GMT
  ls -l /home/runner/.docker/cli-plugins/docker-buildx4
Fri, 03 Dec 2021 21:53:15 GMT

Fri, 03 Dec 2021 21:53:15 GMT
  echo "try use node to copy from ./buildx-v0.7.1.linux-amd64 to /opt/hostedtoolcache first and then copy to /home/runner/.docker/cli-plugins"
Fri, 03 Dec 2021 21:53:15 GMT
  node -e "fs=require('fs'); fs.copyFileSync('./buildx-v0.7.1.linux-amd64', '/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx5'); "
Fri, 03 Dec 2021 21:53:15 GMT
  ls -l /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx5
Fri, 03 Dec 2021 21:53:15 GMT
  node -e "fs=require('fs'); fs.copyFileSync('/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx5', '/home/runner/.docker/cli-plugins/docker-buildx6'); "
Fri, 03 Dec 2021 21:53:15 GMT
  ls -l /home/runner/.docker/cli-plugins/docker-buildx6
Fri, 03 Dec 2021 21:53:15 GMT
  shell: /usr/bin/bash -e {0}
Fri, 03 Dec 2021 21:53:15 GMT
runner
Fri, 03 Dec 2021 21:53:15 GMT
try download and copy with curl and cp
Fri, 03 Dec 2021 21:53:15 GMT
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Fri, 03 Dec 2021 21:53:15 GMT
                                 Dload  Upload   Total   Spent    Left  Speed
Fri, 03 Dec 2021 21:53:15 GMT

Fri, 03 Dec 2021 21:53:16 GMT
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
Fri, 03 Dec 2021 21:53:16 GMT
100   664  100   664    0     0   5487      0 --:--:-- --:--:-- --:--:--  5487
Fri, 03 Dec 2021 21:53:16 GMT

Fri, 03 Dec 2021 21:53:16 GMT
 28 44.9M   28 13.0M    0     0  33.8M      0  0:00:01 --:--:--  0:00:01 33.8M
Fri, 03 Dec 2021 21:53:16 GMT
100 44.9M  100 44.9M    0     0  65.6M      0 --:--:-- --:--:-- --:--:--  107M
Fri, 03 Dec 2021 21:53:16 GMT
'./buildx-v0.7.1.linux-amd64' -> '/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx'
Fri, 03 Dec 2021 21:53:16 GMT
'/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx' -> '/home/runner/.docker/cli-plugins/docker-buildx'
Fri, 03 Dec 2021 21:53:16 GMT
-rw-r--r-- 1 runner runner 47112192 Dec  3 21:53 /home/runner/.docker/cli-plugins/docker-buildx
Fri, 03 Dec 2021 21:53:16 GMT
v16.13.1
Fri, 03 Dec 2021 21:53:16 GMT
try use node to copy from /opt/hostedtoolcache to /home/runner/.docker/cli-plugins
Fri, 03 Dec 2021 21:53:16 GMT
-rw-r--r-- 1 runner runner 47112192 Dec  3 21:53 /home/runner/.docker/cli-plugins/docker-buildx2
Fri, 03 Dec 2021 21:53:16 GMT
try use node to copy from ./buildx-v0.7.1.linux-amd64 to /opt/hostedtoolcache first and then use cp to copy to /home/runner/.docker/cli-plugins
Fri, 03 Dec 2021 21:53:16 GMT
-rw-r--r-- 1 runner runner 47112192 Dec  3 21:53 /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx3
Fri, 03 Dec 2021 21:53:16 GMT
'/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx3' -> '/home/runner/.docker/cli-plugins/docker-buildx4'
Fri, 03 Dec 2021 21:53:16 GMT
-rw-r--r-- 1 runner runner 47112192 Dec  3 21:53 /home/runner/.docker/cli-plugins/docker-buildx4
Fri, 03 Dec 2021 21:53:16 GMT
try use node to copy from ./buildx-v0.7.1.linux-amd64 to /opt/hostedtoolcache first and then copy to /home/runner/.docker/cli-plugins
Fri, 03 Dec 2021 21:53:17 GMT
-rw-r--r-- 1 runner runner 47112192 Dec  3 21:53 /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx5
Fri, 03 Dec 2021 21:53:54 GMT
Error: The operation was canceled.
crazy-max commented 2 years ago

@TingluoHuang

it's related to the double copy... 😵‍💫

Ok so cleaning dest beforehand should mitigate this issue in the meantime.

maxnowack commented 2 years ago

I've tried to downgrade to summerwind/actions-runner:v2.284.0-ubuntu-20.04, but it doesn't work. Is there a workaround for this issue?

rjhenry commented 2 years ago

I've tried to downgrade to summerwind/actions-runner:v2.284.0-ubuntu-20.04, but it doesn't work. Is there a workaround for this issue?

Yeah, unfortunately the runners self-update. I'm running a very hacky loop on my local machine to keep pipelines running for myself - fortunately, I'm the one triggering most of these pipelines so it's only me inconvenienced.

while true; do
  for pod in $(k get po -n github-actions-runner -l runner-deployment-name=assureddt -o jsonpath='{.items[*].metadata.name}'); do
    k exec -it -n github-actions-runner ${pod} -c runner -- rm -r /opt/hostedtoolcache/buildx/
  done
  sleep 60
done

It's dirty and I'm not proud of it, but it is working.

crazy-max commented 2 years ago

@maxnowack https://github.com/docker/setup-buildx-action/issues/117#issuecomment-984664730 should mitigate your issue in the meantime.

crazy-max commented 2 years ago

Should be solved since https://github.com/actions/runner/releases/tag/v2.285.1

maxnowack commented 2 years ago

Should be solved since actions/runner@v2.285.1 (release)

I can confirm, that it is resolved with the new runner version 🙂

novascreen commented 2 years ago

we are now running into this issue since our runners updated to v2.288.0

the extra Install Docker Buildx step described earlier does not help

crazy-max commented 2 years ago

@novascreen Must be linked to https://github.com/actions/runner/pull/1651 which updates node 12 with the same issue. Suggest to open an issue on https://github.com/actions/runner. cc @TingluoHuang @thboop

TingluoHuang commented 2 years ago

@novascreen what Linux kernel version your environment is on? there was a kernel bug get patched in Linux 5.4 which break node.js

You can read more from https://github.com/actions/runner/issues/1536

thril commented 2 years ago

FYI, this is also happening on runner version 2.288.1, which I believe is expected from what I've read here and in the linked issues. We're using a self-hosted runner and this job is using container: node:14.17.6.

In the "Docker info" section for docker/setup-buildx-action@v1, it does display a warning:

WARNING: Plugin "/github/home/.docker/cli-plugins/docker-buildx" is not valid: failed to fetch metadata: fork/exec /github/home/.docker/cli-plugins/docker-buildx: permission denied

Not sure if that is significant or not. Then on the Download and install buildx section it hangs. This issue is now blocking our deploys. 😢

dresselm commented 2 years ago

@crazy-max @thril

This latest GHA runner, 2.288.1, has a NODE update again (https://github.com/actions/runner/releases/tag/v2.288.1, https://github.com/actions/runner/pull/1651) and we are seeing the same issue as before. This change is slowly propagating out as it did last time and some of our runners are running 2.287.1 and some have the latest.

Theoretically, GHA-folk believe that since the Kernel issue is fixed, they are good to go with the fs.copyFileSync