Closed dresselm closed 2 years ago
I just ran into the same issue, as I attempt a new setup using private hosted runners. I thought it was related to running this in k8s, but this issue describes the exact same behavior. Downloading...
and it just hangs there.
I'd also like to confirm that when I switch to v2.284.0
this works.
apiVersion: actions.summerwind.dev/v1alpha1
kind: Runner
metadata:
name: example-runner
namespace: actions-runner-system
spec:
repository: <redacted>
image: summerwind/actions-runner:v2.284.0-ubuntu-20.04
env: []
@ghostsquad when you make this change (summerwind/actions-runner:v2.284.0-ubuntu-20.04
) to the runner, it uses the older node version (v12.22.7
) here instead of the updated node version (v12.13.1
)? I haven't tried this yet, but will check it out tomorrow. I have been shelling into the runner and checking this: /runner/externals/node12/bin/node -v
and it keeps using the new version and failing.
Another interesting thing I discovered is that this action runs runs using v12 but the latest GitHub runner (v2.285.0) shipped with v16
support as well.
@dresselm
Another interesting thing I discovered is that this action runs runs using v12 but the latest GitHub runner (v2.285.0) shipped with
v16
support as well.
Interesting indeed, are you on GHE?
Our e2e workflow works fine with runner version: '2.285.0'
:
Are you using a self-hosted runner?
We're seeing the same error on self-hosted runners. Version auto-changed overnight and now the job is timing out on the uses: docker/setup-buildx-action@v1.5.0
step:
WARNING: Plugin "/home/runner/.docker/cli-plugins/docker-buildx" is not valid: failed to fetch metadata: fork/exec /home/runner/.docker/cli-plugins/docker-buildx: permission denied
WARNING: No swap limit support
Download and install buildx
Error: The operation was canceled.
Yet another "same here". Self-hosted k8s runners on a private repo stuck forever at the buildx step.
Seeing the same here, first noted on Tuesday afternoon (UK time)
Can someone provide the workflow logs of an affected runner please (private info redacted ofc)? Thanks.
@crazy-max
Set-up job logs:
2021-12-02T13:24:49.5232557Z Current runner version: '2.285.0'
2021-12-02T13:24:49.5238376Z Runner name: 'shared-vjx25-5cz76'
2021-12-02T13:24:49.5239071Z Runner group name: 'Default'
2021-12-02T13:24:49.5240198Z Machine name: 'shared-vjx25-5cz76'
2021-12-02T13:24:49.5243337Z ##[group]GITHUB_TOKEN Permissions
2021-12-02T13:24:49.5244528Z Actions: write
2021-12-02T13:24:49.5245038Z Checks: write
2021-12-02T13:24:49.5245509Z Contents: write
2021-12-02T13:24:49.5246054Z Deployments: write
2021-12-02T13:24:49.5246576Z Discussions: write
2021-12-02T13:24:49.5247045Z Issues: write
2021-12-02T13:24:49.5247521Z Metadata: read
2021-12-02T13:24:49.5247996Z Packages: write
2021-12-02T13:24:49.5248538Z Pages: write
2021-12-02T13:24:49.5249020Z PullRequests: write
2021-12-02T13:24:49.5249610Z RepositoryProjects: write
2021-12-02T13:24:49.5250185Z SecurityEvents: write
2021-12-02T13:24:49.5250721Z Statuses: write
2021-12-02T13:24:49.5251310Z ##[endgroup]
2021-12-02T13:24:49.5253761Z Secret source: Actions
2021-12-02T13:24:49.5254650Z Prepare workflow directory
2021-12-02T13:24:49.6124677Z Prepare all required actions
2021-12-02T13:24:49.6133014Z Getting action download info
2021-12-02T13:24:49.9042420Z Download action repository 'actions/checkout@v2' (SHA:ec3a7ce113134d7a93b817d10a8272cb61118579)
2021-12-02T13:24:50.3898900Z Download action repository 'actions/setup-node@v2.2.0' (SHA:38d90ce44d5275ad62cc48384b3d8a58c500bb5f)
2021-12-02T13:24:50.9495641Z Download action repository 'actions/cache@v2' (SHA:937d24475381cd9c75ae6db12cb4e79714b926ed)
2021-12-02T13:24:51.4209320Z Download action repository 'azure/CLI@v1' (SHA:7378ce2ca3c38b4b063feb7a4cbe384fef978055)
2021-12-02T13:24:51.9278970Z Download action repository 'docker/login-action@v1.10.0' (SHA:f054a8b539a109f9f41c372932f1ae047eff08c9)
2021-12-02T13:24:52.2985638Z Download action repository 'docker/setup-buildx-action@v1.5.0' (SHA:e673438944759779e411a0f7ceef3ba437dccfa0)
2021-12-02T13:24:52.7750364Z Download action repository 'docker/build-push-action@v2.6.1' (SHA:1bc1040caef9e604eb543693ba89b5bf4fc80935)
docker/setup-buildx-action@v1.5.0
logs:
2021-12-02T13:26:53.7736865Z ##[group]Run docker/setup-buildx-action@v1.5.0
2021-12-02T13:26:53.7737409Z with:
2021-12-02T13:26:53.7737792Z driver: docker
2021-12-02T13:26:53.7738728Z buildkitd-flags: --allow-insecure-entitlement security.insecure --allow-insecure-entitlement network.host
2021-12-02T13:26:53.7739669Z install: false
2021-12-02T13:26:53.7740122Z use: true
2021-12-02T13:26:53.7740488Z env:
2021-12-02T13:26:53.7741443Z EXPOSE_DOMAIN_NAME: <redacted>
2021-12-02T13:26:53.7742918Z ACR_PREVIEW: <redacted>
2021-12-02T13:26:53.7743756Z VERSION: pr-3-18
2021-12-02T13:26:53.7886312Z TOKEN: ***
2021-12-02T13:26:53.7886738Z ##[endgroup]
2021-12-02T13:26:53.8429557Z ##[group]Docker info
2021-12-02T13:26:53.8458069Z [command]/usr/local/bin/docker version
2021-12-02T13:26:53.8912983Z Client:
2021-12-02T13:26:53.8915314Z Version: 20.10.8
2021-12-02T13:26:53.8915801Z API version: 1.41
2021-12-02T13:26:53.8918136Z Go version: go1.16.6
2021-12-02T13:26:53.8918888Z Git commit: 3967b7d
2021-12-02T13:26:53.8919435Z Built: Fri Jul 30 19:50:40 2021
2021-12-02T13:26:53.8921369Z OS/Arch: linux/amd64
2021-12-02T13:26:53.8921845Z Context: default
2021-12-02T13:26:53.8922455Z Experimental: true
2021-12-02T13:26:53.8923729Z
2021-12-02T13:26:53.8925459Z Server: Docker Engine - Community
2021-12-02T13:26:53.8926831Z Engine:
2021-12-02T13:26:53.8927325Z Version: 20.10.9
2021-12-02T13:26:53.8928714Z API version: 1.41 (minimum version 1.12)
2021-12-02T13:26:53.8929542Z Go version: go1.16.8
2021-12-02T13:26:53.8930283Z Git commit: 79ea9d3
2021-12-02T13:26:53.8930946Z Built: Mon Oct 4 16:07:30 2021
2021-12-02T13:26:53.8931662Z OS/Arch: linux/amd64
2021-12-02T13:26:53.8932425Z Experimental: false
2021-12-02T13:26:53.8933640Z containerd:
2021-12-02T13:26:53.8934337Z Version: v1.4.11
2021-12-02T13:26:53.8935253Z GitCommit: 5b46e404f6b9f661a205e28d59c982d3634148f8
2021-12-02T13:26:53.8936558Z runc:
2021-12-02T13:26:53.8937150Z Version: 1.0.2
2021-12-02T13:26:53.8938645Z GitCommit: v1.0.2-0-g52b36a2d
2021-12-02T13:26:53.8939681Z docker-init:
2021-12-02T13:26:53.8940315Z Version: 0.19.0
2021-12-02T13:26:53.8941084Z GitCommit: de40ad0
2021-12-02T13:26:53.8952687Z [command]/usr/local/bin/docker info
2021-12-02T13:26:53.9405279Z Client:
2021-12-02T13:26:53.9406048Z Context: default
2021-12-02T13:26:53.9406650Z Debug Mode: false
2021-12-02T13:26:53.9407019Z
2021-12-02T13:26:53.9407499Z Server:
2021-12-02T13:26:53.9407932Z Containers: 0
2021-12-02T13:26:53.9408407Z Running: 0
2021-12-02T13:26:53.9408826Z Paused: 0
2021-12-02T13:26:53.9409270Z Stopped: 0
2021-12-02T13:26:53.9409669Z Images: 1
2021-12-02T13:26:53.9410099Z Server Version: 20.10.9
2021-12-02T13:26:53.9410593Z Storage Driver: overlay2
2021-12-02T13:26:53.9411109Z Backing Filesystem: extfs
2021-12-02T13:26:53.9411612Z Supports d_type: true
2021-12-02T13:26:53.9412103Z Native Overlay Diff: false
2021-12-02T13:26:53.9412589Z userxattr: false
2021-12-02T13:26:53.9413999Z Logging Driver: json-file
2021-12-02T13:26:53.9414541Z Cgroup Driver: cgroupfs
2021-12-02T13:26:53.9415026Z Cgroup Version: 1
2021-12-02T13:26:53.9415474Z Plugins:
2021-12-02T13:26:53.9415882Z Volume: local
2021-12-02T13:26:53.9416428Z Network: bridge host ipvlan macvlan null overlay
2021-12-02T13:26:53.9417424Z Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
2021-12-02T13:26:53.9418092Z Swarm: inactive
2021-12-02T13:26:53.9418942Z Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
2021-12-02T13:26:53.9419836Z Default Runtime: runc
2021-12-02T13:26:53.9420492Z Init Binary: docker-init
2021-12-02T13:26:53.9421142Z containerd version: 5b46e404f6b9f661a205e28d59c982d3634148f8
2021-12-02T13:26:53.9421911Z runc version: v1.0.2-0-g52b36a2d
2021-12-02T13:26:53.9422417Z init version: de40ad0
2021-12-02T13:26:53.9423218Z Security Options:
2021-12-02T13:26:53.9423614Z apparmor
2021-12-02T13:26:53.9423999Z seccomp
2021-12-02T13:26:53.9424432Z Profile: default
2021-12-02T13:26:53.9425035Z Kernel Version: 5.4.0-1063-azure
2021-12-02T13:26:53.9425667Z Operating System: Alpine Linux v3.14 (containerized)
2021-12-02T13:26:53.9426213Z OSType: linux
2021-12-02T13:26:53.9426628Z Architecture: x86_64
2021-12-02T13:26:53.9427054Z CPUs: 8
2021-12-02T13:26:53.9427474Z Total Memory: 31.36GiB
2021-12-02T13:26:53.9428268Z Name: shared-vjx25-5cz76
2021-12-02T13:26:53.9428857Z ID: MKF2:72OS:YKOV:2E5D:635C:TQSY:YW63:FN5L:JMGK:HTZQ:X6CY:C4XG
2021-12-02T13:26:53.9429427Z Docker Root Dir: /var/lib/docker
2021-12-02T13:26:53.9429926Z Debug Mode: false
2021-12-02T13:26:53.9430539Z Registry: https://index.docker.io/v1/
2021-12-02T13:26:53.9431080Z Labels:
2021-12-02T13:26:53.9431509Z Experimental: false
2021-12-02T13:26:53.9432013Z Insecure Registries:
2021-12-02T13:26:53.9432402Z 127.0.0.0/8
2021-12-02T13:26:53.9432838Z Live Restore Enabled: false
2021-12-02T13:26:53.9433380Z Product License: Community Engine
2021-12-02T13:26:53.9433715Z
2021-12-02T13:26:53.9434164Z WARNING: No swap limit support
2021-12-02T13:26:53.9435253Z ##[endgroup]
2021-12-02T13:26:53.9851668Z ##[group]Download and install buildx
2021-12-02T13:26:54.1115979Z Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
2021-12-02T13:38:51.7240440Z ##[error]The operation was canceled.
@jnsvd I'm really confused. Can you add this step before the setup-buildx-action
step to see if it's linked to the actions/tool-cache
pkg that downloads buildx?:
- name: Install Docker Buildx
run: |
mkdir -p "$HOME/.docker/cli-plugins"
curl -SsL "https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64" -o "$HOME/.docker/cli-plugins/docker-buildx"
chmod +x "$HOME/.docker/cli-plugins/docker-buildx"
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v1
@crazy-max If we add the extra Install Docker Buildx
step the pipeline finished successfully.
@crazy-max Step actions/cache@v2
doesn't seem to be related to the problem, when removing it (and your additional step), the pipeline is still freezing on the docker/setup-buildx-action
step. When applying your step and the actions/cache@v2
the pipeline ran to completion.
I can confirm that the Install Docker Buildx
step above does allow the setup-buildx-action
to run successfully.
Ok thanks for your feedback! One last thing, can you add the ACTIONS_RUNNER_DEBUG
secret to true
and give me the output? It could tell us if this is an issue with the actions/tool-cache
.
@jnsvd
Step
actions/cache@v2
doesn't seem to be related to the problem
This is not about the actions/cache
action but the node module actions/tool-cache
we use in the setup-buildx-action
: https://github.com/actions/toolkit/tree/main/packages/tool-cache.
Let me know if you need more logs than this, but you can see that the fixed version goes to Creating a new builder instance
instead of getting stuck on Download and install buildx
The operation was canceled.
is misleading. The action never times out. That message appears when you cancel the workflow.
Ok thanks for your feedback! One last thing, can you add the
ACTIONS_RUNNER_DEBUG
secret totrue
and give me the output? It could tell us if this is an issue with theactions/tool-cache
.
There's no more output when enabling diagnostic logging. The pipeline really seems to get stuck on Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
somehow.
Would it be possible to add more logging for each of the steps? That might help future debugging
@jnsvd
There's no more output when enabling diagnostic logging. The pipeline really seems to get stuck on
Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
somehow.
Oh my bad it's ACTIONS_STEP_DEBUG
to true
and you should have smth like this:
Friendly ping @thboop @luketomlinson. Are you aware of some issues with the actions/tool-cache
module on self-hosted runners updated to the latest v2.285.0
?
We use the downloadTool
and cacheFile
functions here from actions/tool-cache
and it seems to be frozen while caching. Any idea?
@crazy-max @richardpeng
Regarding adding more debug logging, I was stuck yesterday even though I had ACTIONS_STEP_DEBUG=true
setup correctly. Primary reason was that I saw the last line Plugin path...
but wasn't seeing Fixing perms
and there was no additional debug details describing what was happening while it was stuck.
Github is slowly rolling out updates to their runners so it took many tries to get a run on an upgraded self-hosted runner....
Here are the logs:
There are no further logs. It's just stuck at that spot
Update:
It looks like this line is the culprit:
I wonder if this is not a bug with fs.copyFileSync
in the node version 12.22.7 added in v2.285.0 of the runner as shown in https://github.com/nodejs/node/issues/40200
No specific reports on your side @TingluoHuang @thboop?
@crazy-max This is the first report we got since we just bump the node12 packaged with the runner (v2.285.0) to that latest v12 LTS version. The issue you linked is for only node v16, isn't it?
@TingluoHuang
@crazy-max This is the first report we got since we just bump the node12 packaged with the runner (v2.285.0) to that latest v12 LTS version. The issue you linked is for only node v16, isn't it?
Yes indeed I try to find the root cause maybe it has been backported to 12.22.7 I'm not quite sure but anyway I tried again on a fresh runner 2.285.0 and can't repro (Ubuntu 20.04):
@richardpeng @dresselm @jnsvd @rjhenry @ghostsquad and others with this issue, can you show your Set up job
logs like my screenshot? I wonder if the virtual env image should not been updated on your side. And also is this happening only on self-hosted runners for you all?
I think it might have something to do with the docker container volume mount. I am playing with it right now.
@TingluoHuang Thanks!
Current runner version: '2.285.0'
Runner name: 'example-runner'
Runner group name: 'Default'
Machine name: 'example-runner'
GITHUB_TOKEN Permissions
Secret source: Actions
Prepare workflow directory
Prepare all required actions
Getting action download info
Download action repository 'actions/checkout@v2' (SHA:ec3a7ce113134d7a93b817d10a8272cb61118579)
Download action repository 'asdf-vm/actions@v1' (SHA:707e84f3ee349548310aeabdad0dd3bfcb9b69fa)
Download action repository 'docker/setup-buildx-action@v1' (SHA:94ab11c41e45d028884a99163086648e898eed25)
Download action repository 'actions/setup-go@v2' (SHA:331ce1d993939866bb63c32c6cbbfd48fa76fc57)
Download action repository 'actions/cache@v2' (SHA:937d24475381cd9c75ae6db12cb4e79714b926ed)
I'm not sure how useful this actually, is, since this was a private runner, it was using this image:
summerwind/actions-runner:v2.285.0-ubuntu-20.04
Can anyone with repro environment set environment NODE_DEBUG=fs
when run the actions and share the log?
I can't repro with just docker + summerwind/actions-runner:v2.285.0-ubuntu-20.04
runner@fv-az43-260:~/work/ting-test/ting-test$ docker run -it --rm summerwind/actions-runner:v2.285.0-ubuntu-20.04 bash
Unable to find image 'summerwind/actions-runner:v2.285.0-ubuntu-20.04' locally
v2.285.0-ubuntu-20.04: Pulling from summerwind/actions-runner
7b1a6ab2e44d: Already exists
8dccfeffb9d5: Pull complete
677f2ea3860b: Pull complete
a7e2a0869a62: Pull complete
3a85d00f8ff6: Pull complete
578b8578fd95: Pull complete
c4b1e29c1c08: Pull complete
ae9503151e38: Pull complete
Digest: sha256:d08b7acb0f31c37ee50383a8a37164bb7401e1f88a7e7cff879934bf2f8c675a
Status: Downloaded newer image for summerwind/actions-runner:v2.285.0-ubuntu-20.04
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.
runner@da451f87281b:/$ ls
bin boot dev entrypoint.sh etc home lib lib32 lib64 libx32 media mnt opt proc root run runnertmp sbin srv sys tmp usr var
runner@da451f87281b:/$ cd runnertmp/
runner@da451f87281b:/runnertmp$ ls
bin config.sh env.sh externalstmp patched run.sh
runner@da451f87281b:/runnertmp$ curl -L -O https://github.com/TingluoHuang/setup-buildx-action/archive/refs/heads/master.tar.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 151 100 151 0 0 1935 0 --:--:-- --:--:-- --:--:-- 1935
100 244k 0 244k 0 0 1111k 0 --:--:-- --:--:-- --:--:-- 1111k
runner@da451f87281b:/runnertmp$ sudo tar xzf ./master.tar.gz
runner@da451f87281b:/runnertmp$ ls
bin config.sh env.sh externalstmp master.tar.gz patched run.sh setup-buildx-action-master
runner@da451f87281b:/runnertmp$ cd setup-buildx-action-master/
runner@da451f87281b:/runnertmp/setup-buildx-action-master$ ls
LICENSE README.md __tests__ action.yml codecov.yml dist docker-bake.hcl hack jest.config.js package-lock.json package.json src tsconfig.json yarn.lock
runner@da451f87281b:/runnertmp$ mkdir tmp
runner@da451f87281b:/runnertmp$ export RUNNER_TEMP=/run
run/ runnertmp/
runner@da451f87281b:/runnertmp$ export RUNNER_TEMP=/runnertmp/tmp/
runner@da451f87281b:/runnertmp$ ./externalstmp/node12/bin/node ./setup-buildx-action-master/dist/index.js
::group::Download and install buildx
::debug::Release v0.7.1 found
::debug::isExplicit: 0.7.1
::debug::explicit? true
::debug::checking cache: /opt/hostedtoolcache/buildx/0.7.1/x64
::debug::not found
Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
::debug::Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
::debug::Destination /runnertmp/tmp/014dd3b9-91d0-4d90-9824-9b860735f619
::debug::download complete
::debug::Downloaded to /runnertmp/tmp/014dd3b9-91d0-4d90-9824-9b860735f619
::debug::Caching tool buildx 0.7.1 x64
::debug::source file: /runnertmp/tmp/014dd3b9-91d0-4d90-9824-9b860735f619
::debug::destination /opt/hostedtoolcache/buildx/0.7.1/x64
::debug::destination file /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx
::debug::finished caching tool
::debug::Plugins dir is /home/runner/.docker/cli-plugins
::debug::Plugin path is /home/runner/.docker/cli-plugins/docker-buildx
Fixing perms
::endgroup::
runner@da451f87281b:/runnertmp$
Might have something to do with Kubernetes, trying it out...
Also can't repro with minikube on my ubuntu VM...
runner@fv-az43-260:~/work/ting-test/ting-test$ cat <<EOF | kubectl apply -f -
> apiVersion: v1
> kind: Pod
> metadata:
> name: command-demo
> spec:
> containers:
> - name: command-demo-container
> image: summerwind/actions-runner:v2.285.0-ubuntu-20.04
> command: ["sleep"]
> args: ["100000"]
> EOF
pod/command-demo created
runner@fv-az43-260:~/work/ting-test/ting-test$ kubectl exec -it command-demo bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.
runner@command-demo:/$ ls
bin boot dev entrypoint.sh etc home lib lib32 lib64 libx32 media mnt opt proc root run runnertmp sbin srv sys tmp usr var
runner@command-demo:/$ cd runnertmp/
runner@command-demo:/runnertmp$ curl -L -O https://github.com/TingluoHuang/setup-buildx-action/archive/refs/heads/master.tar.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 151 100 151 0 0 1451 0 --:--:-- --:--:-- --:--:-- 1451
100 244k 0 244k 0 0 1010k 0 --:--:-- --:--:-- --:--:-- 1010k
runner@command-demo:/runnertmp$ tar xcf ./master.tar.gz
tar: You may not specify more than one '-Acdtrux', '--delete' or '--test-label' option
Try 'tar --help' or 'tar --usage' for more information.
runner@command-demo:/runnertmp$ tar xzf ./master.tar.gz
runner@command-demo:/runnertmp$ mkdir tmp
runner@command-demo:/runnertmp$ export RUNNER_TEMP=/runnertmp/tmp
runner@command-demo:/runnertmp$ ./externalstmp/node12/bin/node ./setup-buildx-action-master/dist/index.js
::group::Download and install buildx
::debug::Release v0.7.1 found
::debug::isExplicit: 0.7.1
::debug::explicit? true
::debug::checking cache: /opt/hostedtoolcache/buildx/0.7.1/x64
::debug::not found
Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
::debug::Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
::debug::Destination /runnertmp/tmp/7eb2f149-3be0-427d-b3e4-42898e526a31
::debug::download complete
::debug::Downloaded to /runnertmp/tmp/7eb2f149-3be0-427d-b3e4-42898e526a31
::debug::Caching tool buildx 0.7.1 x64
::debug::source file: /runnertmp/tmp/7eb2f149-3be0-427d-b3e4-42898e526a31
::debug::destination /opt/hostedtoolcache/buildx/0.7.1/x64
::debug::destination file /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx
::debug::finished caching tool
::debug::Plugins dir is /home/runner/.docker/cli-plugins
::debug::Plugin path is /home/runner/.docker/cli-plugins/docker-buildx
Fixing perms
::endgroup::
runner@command-demo:/runnertmp$
Can anyone with repro environment try what I have run and see if they get a repro?
BTW, I forked the repro and remove all other code and only keep the part of downloading and filecopy in https://github.com/TingluoHuang/setup-buildx-action
.Can anyone with repro environment try what I have run and see if they get a repro?
Could not reproduce it your way on a failing self-hosted runner (v2.285.0-ubuntu-20.04)
@richardpeng @dresselm @jnsvd @rjhenry @ghostsquad and others with this issue, can you show your Set up job logs like my screenshot? I wonder if the virtual env image should not been updated on your side. And also is this happening only on self-hosted runners for you all?
@crazy-max I already shared my Set up job
logs earlier, note that it doesn't provide any information on a virtual environment. Probably not something shown for self-hosted runners? There's not much info in that section besides the runner's version.
Can anyone with repro environment set environment NODE_DEBUG=fs when run the actions and share the log?
@TingluoHuang I'm getting the same output that @richardpeng is seeing:
name: Setup Docker buildx
uses: docker/setup-buildx-action@v1.6.0
with:
driver: docker
env:
NODE_DEBUG: fs
2021-12-03T09:02:35.4142614Z ##[debug]Evaluating condition for step: 'Setup Docker buildx'
2021-12-03T09:02:35.4144259Z ##[debug]Evaluating: success()
2021-12-03T09:02:35.4145026Z ##[debug]Evaluating success:
2021-12-03T09:02:35.4145772Z ##[debug]=> true
2021-12-03T09:02:35.4146536Z ##[debug]Result: true
2021-12-03T09:02:35.4147438Z ##[debug]Starting: Setup Docker buildx
2021-12-03T09:02:35.4167907Z ##[debug]Register post job cleanup for action: docker/setup-buildx-action@v1.6.0
2021-12-03T09:02:35.4181341Z ##[debug]Loading inputs
2021-12-03T09:02:35.4186780Z ##[debug]Loading env
2021-12-03T09:02:35.4193642Z ##[group]Run docker/setup-buildx-action@v1.6.0
2021-12-03T09:02:35.4194118Z with:
2021-12-03T09:02:35.4194462Z driver: docker
2021-12-03T09:02:35.4195352Z buildkitd-flags: --allow-insecure-entitlement security.insecure --allow-insecure-entitlement network.host
2021-12-03T09:02:35.4196216Z install: false
2021-12-03T09:02:35.4196660Z use: true
2021-12-03T09:02:35.4197197Z env:
2021-12-03T09:02:35.4198282Z EXPOSE_DOMAIN_NAME: <redacted>
2021-12-03T09:02:35.4200801Z ACR_PREVIEW: <redacted>
2021-12-03T09:02:35.4201593Z VERSION: pr-4-26
2021-12-03T09:02:35.4310032Z TOKEN: ***
2021-12-03T09:02:35.4310414Z NODE_DEBUG: fs
2021-12-03T09:02:35.4310780Z ##[endgroup]
2021-12-03T09:02:35.4866965Z ::save-state name=isPost::true
2021-12-03T09:02:35.4867592Z ##[debug]Save intra-action state isPost = true
2021-12-03T09:02:35.4871957Z ::group::Docker info
2021-12-03T09:02:35.4872395Z ##[group]Docker info
2021-12-03T09:02:35.4901885Z [command]/usr/local/bin/docker version
2021-12-03T09:02:35.5365418Z Client:
2021-12-03T09:02:35.5368133Z Version: 20.10.8
2021-12-03T09:02:35.5368703Z API version: 1.41
2021-12-03T09:02:35.5369258Z Go version: go1.16.6
2021-12-03T09:02:35.5369795Z Git commit: 3967b7d
2021-12-03T09:02:35.5370370Z Built: Fri Jul 30 19:50:40 2021
2021-12-03T09:02:35.5370928Z OS/Arch: linux/amd64
2021-12-03T09:02:35.5371470Z Context: default
2021-12-03T09:02:35.5372035Z Experimental: true
2021-12-03T09:02:35.5372568Z
2021-12-03T09:02:35.5374074Z Server: Docker Engine - Community
2021-12-03T09:02:35.5374541Z Engine:
2021-12-03T09:02:35.5374922Z Version: 20.10.9
2021-12-03T09:02:35.5375395Z API version: 1.41 (minimum version 1.12)
2021-12-03T09:02:35.5375877Z Go version: go1.16.8
2021-12-03T09:02:35.5376301Z Git commit: 79ea9d3
2021-12-03T09:02:35.5376743Z Built: Mon Oct 4 16:07:30 2021
2021-12-03T09:02:35.5377209Z OS/Arch: linux/amd64
2021-12-03T09:02:35.5377848Z Experimental: false
2021-12-03T09:02:35.5378282Z containerd:
2021-12-03T09:02:35.5378685Z Version: v1.4.11
2021-12-03T09:02:35.5379277Z GitCommit: 5b46e404f6b9f661a205e28d59c982d3634148f8
2021-12-03T09:02:35.5379791Z runc:
2021-12-03T09:02:35.5380419Z Version: 1.0.2
2021-12-03T09:02:35.5381075Z GitCommit: v1.0.2-0-g52b36a2d
2021-12-03T09:02:35.5381619Z docker-init:
2021-12-03T09:02:35.5382023Z Version: 0.19.0
2021-12-03T09:02:35.5382430Z GitCommit: de40ad0
2021-12-03T09:02:35.5399999Z [command]/usr/local/bin/docker info
2021-12-03T09:02:35.5851091Z Client:
2021-12-03T09:02:35.5851815Z Context: default
2021-12-03T09:02:35.5852354Z Debug Mode: false
2021-12-03T09:02:35.5852698Z
2021-12-03T09:02:35.5853153Z Server:
2021-12-03T09:02:35.5853983Z Containers: 0
2021-12-03T09:02:35.5854485Z Running: 0
2021-12-03T09:02:35.5854857Z Paused: 0
2021-12-03T09:02:35.5855223Z Stopped: 0
2021-12-03T09:02:35.5855569Z Images: 1
2021-12-03T09:02:35.5855964Z Server Version: 20.10.9
2021-12-03T09:02:35.5856412Z Storage Driver: overlay2
2021-12-03T09:02:35.5856894Z Backing Filesystem: extfs
2021-12-03T09:02:35.5857352Z Supports d_type: true
2021-12-03T09:02:35.5857808Z Native Overlay Diff: false
2021-12-03T09:02:35.5858272Z userxattr: false
2021-12-03T09:02:35.5859299Z Logging Driver: json-file
2021-12-03T09:02:35.5859984Z Cgroup Driver: cgroupfs
2021-12-03T09:02:35.5860411Z Cgroup Version: 1
2021-12-03T09:02:35.5860779Z Plugins:
2021-12-03T09:02:35.5861231Z Volume: local
2021-12-03T09:02:35.5861737Z Network: bridge host ipvlan macvlan null overlay
2021-12-03T09:02:35.5862699Z Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
2021-12-03T09:02:35.5863338Z Swarm: inactive
2021-12-03T09:02:35.5864095Z Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
2021-12-03T09:02:35.5864825Z Default Runtime: runc
2021-12-03T09:02:35.5865383Z Init Binary: docker-init
2021-12-03T09:02:35.5866016Z containerd version: 5b46e404f6b9f661a205e28d59c982d3634148f8
2021-12-03T09:02:35.5866871Z runc version: v1.0.2-0-g52b36a2d
2021-12-03T09:02:35.5867419Z init version: de40ad0
2021-12-03T09:02:35.5867901Z Security Options:
2021-12-03T09:02:35.5868297Z apparmor
2021-12-03T09:02:35.5868665Z seccomp
2021-12-03T09:02:35.5869162Z Profile: default
2021-12-03T09:02:35.5869844Z Kernel Version: 5.4.0-1063-azure
2021-12-03T09:02:35.5870616Z Operating System: Alpine Linux v3.14 (containerized)
2021-12-03T09:02:35.5871356Z OSType: linux
2021-12-03T09:02:35.5874152Z Architecture: x86_64
2021-12-03T09:02:35.5874682Z CPUs: 8
2021-12-03T09:02:35.5875191Z Total Memory: 31.36GiB
2021-12-03T09:02:35.5876055Z Name: shared-6424v-hgmpk
2021-12-03T09:02:35.5876833Z ID: 2EHW:36OS:CJ6O:HM7O:K6BZ:5NP2:IBZD:OU4L:5526:ZJZH:J6PK:P445
2021-12-03T09:02:35.5877541Z Docker Root Dir: /var/lib/docker
2021-12-03T09:02:35.5878130Z Debug Mode: false
2021-12-03T09:02:35.5878874Z Registry: https://index.docker.io/v1/
2021-12-03T09:02:35.5879563Z Labels:
2021-12-03T09:02:35.5880106Z Experimental: false
2021-12-03T09:02:35.5880717Z Insecure Registries:
2021-12-03T09:02:35.5881227Z 127.0.0.0/8
2021-12-03T09:02:35.5881817Z Live Restore Enabled: false
2021-12-03T09:02:35.5882532Z Product License: Community Engine
2021-12-03T09:02:35.5882984Z
2021-12-03T09:02:35.5883597Z WARNING: No swap limit support
2021-12-03T09:02:35.5884988Z ::endgroup::
2021-12-03T09:02:35.5885460Z ##[endgroup]
2021-12-03T09:02:35.6560952Z ::group::Download and install buildx
2021-12-03T09:02:35.6561563Z ##[group]Download and install buildx
2021-12-03T09:02:35.9540593Z ##[debug]Release v0.7.1 found
2021-12-03T09:02:35.9546050Z ##[debug]isExplicit: 0.7.1
2021-12-03T09:02:35.9547154Z ##[debug]explicit? true
2021-12-03T09:02:35.9548339Z ##[debug]checking cache: /opt/hostedtoolcache/buildx/0.7.1/x64
2021-12-03T09:02:35.9549650Z ##[debug]not found
2021-12-03T09:02:35.9556881Z Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
2021-12-03T09:02:35.9565690Z ##[debug]Downloading https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
2021-12-03T09:02:35.9567565Z ##[debug]Destination /runner/_work/_temp/965ebd31-5089-455e-a609-c6ce7fcfd06f
2021-12-03T09:02:38.0875929Z ##[debug]download complete
2021-12-03T09:02:38.0877910Z ##[debug]Downloaded to /runner/_work/_temp/965ebd31-5089-455e-a609-c6ce7fcfd06f
2021-12-03T09:02:38.0878983Z ##[debug]Caching tool buildx 0.7.1 x64
2021-12-03T09:02:38.0880252Z ##[debug]source file: /runner/_work/_temp/965ebd31-5089-455e-a609-c6ce7fcfd06f
2021-12-03T09:02:38.0881406Z ##[debug]destination /opt/hostedtoolcache/buildx/0.7.1/x64
2021-12-03T09:02:38.0888387Z ##[debug]destination file /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx
2021-12-03T09:02:38.1196557Z ##[debug]finished caching tool
2021-12-03T09:02:38.1198430Z ##[debug]Plugins dir is /home/runner/.docker/cli-plugins
2021-12-03T09:02:38.1199742Z ##[debug]Plugin path is /home/runner/.docker/cli-plugins/docker-buildx
Seeing the same thing here with summewrinds controller on EKS, I was somehow able to unstuck it by removing /opt/hostedtoolcache/buildx/ on a runner showing the "Downloading" line.
Looks like this is an issue since node v12.19.0
v12.18.4 works fine...
base on https://nodejs.org/uk/blog/release/v12.19.0/
the highly suspicious change is deps: upgrade to libuv 1.39.0 (Colin Ihrig) #34915
which lines up with people discussed in https://github.com/nodejs/node/issues/40200
I opened a PR to switch to node 16 #118, can someone try with crazy-max/docker-setup-buildx-action@node16
?
@TingluoHuang Oh just in time thanks a lot for your feedback!
which lines up with people discussed in nodejs/node#40200
Yeah that was my first assumption in https://github.com/docker/setup-buildx-action/issues/117#issuecomment-984896179
node16 doesn't work, it hangs as well. 😢
it's related to the double copy... 😵💫
Fri, 03 Dec 2021 21:53:15 GMT
export NODE_DEBUG=fs
Fri, 03 Dec 2021 21:53:15 GMT
whoami
Fri, 03 Dec 2021 21:53:15 GMT
echo "try download and copy with curl and cp"
Fri, 03 Dec 2021 21:53:15 GMT
curl -O -L https://github.com/docker/buildx/releases/download/v0.7.1/buildx-v0.7.1.linux-amd64
Fri, 03 Dec 2021 21:53:15 GMT
mkdir -p /opt/hostedtoolcache/buildx/0.7.1/x64
Fri, 03 Dec 2021 21:53:15 GMT
cp -v ./buildx-v0.7.1.linux-amd64 /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx
Fri, 03 Dec 2021 21:53:15 GMT
mkdir -p /home/runner/.docker/cli-plugins
Fri, 03 Dec 2021 21:53:15 GMT
cp -v /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx /home/runner/.docker/cli-plugins/docker-buildx
Fri, 03 Dec 2021 21:53:15 GMT
ls -l /home/runner/.docker/cli-plugins/docker-buildx
Fri, 03 Dec 2021 21:53:15 GMT
Fri, 03 Dec 2021 21:53:15 GMT
node -v
Fri, 03 Dec 2021 21:53:15 GMT
echo "try use node to copy from /opt/hostedtoolcache to /home/runner/.docker/cli-plugins"
Fri, 03 Dec 2021 21:53:15 GMT
node -e "fs=require('fs'); fs.copyFileSync('/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx', '/home/runner/.docker/cli-plugins/docker-buildx2'); "
Fri, 03 Dec 2021 21:53:15 GMT
#node ./setup-buildx-action-master/dist/index.js
Fri, 03 Dec 2021 21:53:15 GMT
ls -l /home/runner/.docker/cli-plugins/docker-buildx2
Fri, 03 Dec 2021 21:53:15 GMT
Fri, 03 Dec 2021 21:53:15 GMT
echo "try use node to copy from ./buildx-v0.7.1.linux-amd64 to /opt/hostedtoolcache first and then use cp to copy to /home/runner/.docker/cli-plugins"
Fri, 03 Dec 2021 21:53:15 GMT
node -e "fs=require('fs'); fs.copyFileSync('./buildx-v0.7.1.linux-amd64', '/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx3'); "
Fri, 03 Dec 2021 21:53:15 GMT
ls -l /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx3
Fri, 03 Dec 2021 21:53:15 GMT
cp -v /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx3 /home/runner/.docker/cli-plugins/docker-buildx4
Fri, 03 Dec 2021 21:53:15 GMT
ls -l /home/runner/.docker/cli-plugins/docker-buildx4
Fri, 03 Dec 2021 21:53:15 GMT
Fri, 03 Dec 2021 21:53:15 GMT
echo "try use node to copy from ./buildx-v0.7.1.linux-amd64 to /opt/hostedtoolcache first and then copy to /home/runner/.docker/cli-plugins"
Fri, 03 Dec 2021 21:53:15 GMT
node -e "fs=require('fs'); fs.copyFileSync('./buildx-v0.7.1.linux-amd64', '/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx5'); "
Fri, 03 Dec 2021 21:53:15 GMT
ls -l /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx5
Fri, 03 Dec 2021 21:53:15 GMT
node -e "fs=require('fs'); fs.copyFileSync('/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx5', '/home/runner/.docker/cli-plugins/docker-buildx6'); "
Fri, 03 Dec 2021 21:53:15 GMT
ls -l /home/runner/.docker/cli-plugins/docker-buildx6
Fri, 03 Dec 2021 21:53:15 GMT
shell: /usr/bin/bash -e {0}
Fri, 03 Dec 2021 21:53:15 GMT
runner
Fri, 03 Dec 2021 21:53:15 GMT
try download and copy with curl and cp
Fri, 03 Dec 2021 21:53:15 GMT
% Total % Received % Xferd Average Speed Time Time Time Current
Fri, 03 Dec 2021 21:53:15 GMT
Dload Upload Total Spent Left Speed
Fri, 03 Dec 2021 21:53:15 GMT
Fri, 03 Dec 2021 21:53:16 GMT
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
Fri, 03 Dec 2021 21:53:16 GMT
100 664 100 664 0 0 5487 0 --:--:-- --:--:-- --:--:-- 5487
Fri, 03 Dec 2021 21:53:16 GMT
Fri, 03 Dec 2021 21:53:16 GMT
28 44.9M 28 13.0M 0 0 33.8M 0 0:00:01 --:--:-- 0:00:01 33.8M
Fri, 03 Dec 2021 21:53:16 GMT
100 44.9M 100 44.9M 0 0 65.6M 0 --:--:-- --:--:-- --:--:-- 107M
Fri, 03 Dec 2021 21:53:16 GMT
'./buildx-v0.7.1.linux-amd64' -> '/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx'
Fri, 03 Dec 2021 21:53:16 GMT
'/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx' -> '/home/runner/.docker/cli-plugins/docker-buildx'
Fri, 03 Dec 2021 21:53:16 GMT
-rw-r--r-- 1 runner runner 47112192 Dec 3 21:53 /home/runner/.docker/cli-plugins/docker-buildx
Fri, 03 Dec 2021 21:53:16 GMT
v16.13.1
Fri, 03 Dec 2021 21:53:16 GMT
try use node to copy from /opt/hostedtoolcache to /home/runner/.docker/cli-plugins
Fri, 03 Dec 2021 21:53:16 GMT
-rw-r--r-- 1 runner runner 47112192 Dec 3 21:53 /home/runner/.docker/cli-plugins/docker-buildx2
Fri, 03 Dec 2021 21:53:16 GMT
try use node to copy from ./buildx-v0.7.1.linux-amd64 to /opt/hostedtoolcache first and then use cp to copy to /home/runner/.docker/cli-plugins
Fri, 03 Dec 2021 21:53:16 GMT
-rw-r--r-- 1 runner runner 47112192 Dec 3 21:53 /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx3
Fri, 03 Dec 2021 21:53:16 GMT
'/opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx3' -> '/home/runner/.docker/cli-plugins/docker-buildx4'
Fri, 03 Dec 2021 21:53:16 GMT
-rw-r--r-- 1 runner runner 47112192 Dec 3 21:53 /home/runner/.docker/cli-plugins/docker-buildx4
Fri, 03 Dec 2021 21:53:16 GMT
try use node to copy from ./buildx-v0.7.1.linux-amd64 to /opt/hostedtoolcache first and then copy to /home/runner/.docker/cli-plugins
Fri, 03 Dec 2021 21:53:17 GMT
-rw-r--r-- 1 runner runner 47112192 Dec 3 21:53 /opt/hostedtoolcache/buildx/0.7.1/x64/docker-buildx5
Fri, 03 Dec 2021 21:53:54 GMT
Error: The operation was canceled.
@TingluoHuang
it's related to the double copy... 😵💫
Ok so cleaning dest beforehand should mitigate this issue in the meantime.
I've tried to downgrade to summerwind/actions-runner:v2.284.0-ubuntu-20.04
, but it doesn't work. Is there a workaround for this issue?
I've tried to downgrade to summerwind/actions-runner:v2.284.0-ubuntu-20.04, but it doesn't work. Is there a workaround for this issue?
Yeah, unfortunately the runners self-update. I'm running a very hacky loop on my local machine to keep pipelines running for myself - fortunately, I'm the one triggering most of these pipelines so it's only me inconvenienced.
while true; do
for pod in $(k get po -n github-actions-runner -l runner-deployment-name=assureddt -o jsonpath='{.items[*].metadata.name}'); do
k exec -it -n github-actions-runner ${pod} -c runner -- rm -r /opt/hostedtoolcache/buildx/
done
sleep 60
done
It's dirty and I'm not proud of it, but it is working.
@maxnowack https://github.com/docker/setup-buildx-action/issues/117#issuecomment-984664730 should mitigate your issue in the meantime.
Should be solved since https://github.com/actions/runner/releases/tag/v2.285.1
Should be solved since actions/runner@
v2.285.1
(release)
I can confirm, that it is resolved with the new runner version 🙂
we are now running into this issue since our runners updated to v2.288.0
the extra Install Docker Buildx
step described earlier does not help
@novascreen Must be linked to https://github.com/actions/runner/pull/1651 which updates node 12 with the same issue. Suggest to open an issue on https://github.com/actions/runner. cc @TingluoHuang @thboop
@novascreen what Linux kernel version your environment is on? there was a kernel bug get patched in Linux 5.4 which break node.js
You can read more from https://github.com/actions/runner/issues/1536
FYI, this is also happening on runner version 2.288.1
, which I believe is expected from what I've read here and in the linked issues. We're using a self-hosted runner and this job is using container: node:14.17.6
.
In the "Docker info" section for docker/setup-buildx-action@v1
, it does display a warning:
WARNING: Plugin "/github/home/.docker/cli-plugins/docker-buildx" is not valid: failed to fetch metadata: fork/exec /github/home/.docker/cli-plugins/docker-buildx: permission denied
Not sure if that is significant or not. Then on the Download and install buildx
section it hangs. This issue is now blocking our deploys. 😢
@crazy-max @thril
This latest GHA runner, 2.288.1
, has a NODE update again (https://github.com/actions/runner/releases/tag/v2.288.1, https://github.com/actions/runner/pull/1651) and we are seeing the same issue as before. This change is slowly propagating out as it did last time and some of our runners are running 2.287.1
and some have the latest.
Theoretically, GHA-folk believe that since the Kernel issue is fixed, they are good to go with the fs.copyFileSync
Behaviour
Our runners were auto-updated this morning from v2.284.0 to v2.285.0 and began to hang on the
Download and Install Buildx
step:Steps to reproduce this issue
docker/setup-buildx-action@v1
in your workflowExpected behaviour
Actual behaviour
Configuration
Note: all steps prior to the failing step pass as expected
Logs
Our logs are littered with private, implementation details. Could you help me isolate which part of the logs you need for diagnosis?