Open Zagitta opened 4 years ago
The same issue:
worker_1 | {"timestamp":"2021-01-18T13:19:06.540640000Z","level":"error","source":"baggageclaim","message":"baggageclaim.fs.run-command.failed","data":{"args":["bash","-e","-x","-c","\n\t\tif [ ! -e $IMAGE_PATH ] || [ \"$(stat --printf=\"%s\" $IMAGE_PATH)\" != \"$SIZE_IN_BYTES\" ]; then\n\t\t\ttouch $IMAGE_PATH\n\t\t\ttruncate -s ${SIZE_IN_BYTES} $IMAGE_PATH\n\t\tfi\n\n\t\tlo=\"$(losetup -j $IMAGE_PATH | cut -d':' -f1)\"\n\t\tif [ -z \"$lo\" ]; then\n\t\t\tlo=\"$(losetup -f --show $IMAGE_PATH)\"\n\t\tfi\n\n\t\tif ! file $IMAGE_PATH | grep BTRFS; then\n\t\t\tmkfs.btrfs --nodiscard $IMAGE_PATH\n\t\tfi\n\n\t\tmkdir -p $MOUNT_PATH\n\n\t\tif ! mountpoint -q $MOUNT_PATH; then\n\t\t\tmount -t btrfs -o discard $lo $MOUNT_PATH\n\t\tfi\n\t"],"command":"/bin/bash","env":["PATH=/usr/local/concourse/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin","MOUNT_PATH=/worker-state/volumes","IMAGE_PATH=/worker-state/volumes.img","SIZE_IN_BYTES=258752974848"],"error":"exit status 1","session":"3.1","stderr":"+ '[' '!' -e /worker-state/volumes.img ']'\n+ touch /worker-state/volumes.img\n+ truncate -s 258752974848 /worker-state/volumes.img\n++ losetup -j /worker-state/volumes.img\n++ cut -d: -f1\n+ lo=\n+ '[' -z '' ']'\n++ losetup -f --show /worker-state/volumes.img\nlosetup: cannot find an unused loop device\n+ lo=\n","stdout":""}}
worker_1 | {"timestamp":"2021-01-18T13:19:06.540741000Z","level":"error","source":"baggageclaim","message":"baggageclaim.failed-to-set-up-driver","data":{"error":"failed to create btrfs filesystem: exit status 1"}}
worker_1 | error: failed to create btrfs filesystem: exit status 1
concourse-docker_worker_1 exited with code 1
I'm running into this issue on #70 but I'm not using docker swarm, just docker.
https://github.com/moby/moby/issues/24862 Looks like this won't be solved anytime soon.
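The losetup failure in the log above ("cannot find an unused loop device") can have several causes. One quick thing to rule out is whether the container can see any loop device nodes at all. The sketch below is only a diagnostic aid under that assumption: the function name is made up for illustration, and it reports which /dev/loopN nodes are visible, not whether any of them is free.

```python
import os
import re
from typing import List

# Matches loop0, loop1, ... but not loop-control.
LOOP_NODE = re.compile(r"^loop(\d+)$")


def visible_loop_devices(dev_dir: str = "/dev") -> List[str]:
    """Return the loop device nodes visible under dev_dir, in numeric order.

    Hypothetical helper: an empty result suggests losetup has nothing to
    attach to, but a non-empty result does not prove a device is unused.
    """
    try:
        names = os.listdir(dev_dir)
    except FileNotFoundError:
        return []
    numbered = []
    for name in names:
        match = LOOP_NODE.match(name)
        if match:
            numbered.append((int(match.group(1)), os.path.join(dev_dir, name)))
    return [path for _, path in sorted(numbered)]
```

Running `print(visible_loop_devices())` inside the worker container and on the host, then comparing the two, should show whether the container is missing the device nodes.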
I've managed to get a little further by replacing privileged: true with cap_add: [NET_ADMIN] and setting CONCOURSE_RUNTIME to containerd.
I'm now stuck on the following error:
{"timestamp":"2022-01-28T19:11:04.686241153Z","level":"error","source":"baggageclaim","message":"baggageclaim.api.volume-server.create-volume-async.failed-to-create","data":{"error":"operation not permitted","handle":"cbd0b4dd-84f8-4a9d-4b01-8ac8c27a968e","privileged":true,"session":"4.1.10","strategy":{"type":"import","path":"/usr/local/concourse/resource-types/docker-image/rootfs.tgz","follow_symlinks":false}}}
This shows up in the web UI as:
run check: find or create container on worker 3272415d73a2: failed to create volume
It might be because I'm trying to create a privileged docker-image container.
I've just brought up the worker service in Docker Swarm successfully with sysbox-runc. However, it requires me to set the default runtime on the nodes that will run worker containers, because docker stack does not support the runtime property in docker-compose.yml:
# cat /etc/docker/daemon.json
{
  "runtimes": {
    "sysbox-runc": {
      "path": "/usr/bin/sysbox-runc"
    }
  },
  "default-runtime": "sysbox-runc"
}
When I try running the hello-world example pipeline from the docs, I get a similar error:
run check: find or create container on worker dc72cdcf8d3d: failed to create volume
However, the reason shown in the logs is strange:
concourse_worker.0.j6y38t6ei8o7@swarm-2 | {"timestamp":"2022-06-24T15:24:51.529209021Z","level":"error","source":"baggageclaim","message":"baggageclaim.api.volume-server.create-volume-async.failed-to-create","data":{"error":"invalid argument","handle":"7c77c360-c3ed-46aa-62bc-dae9695f43b6","privileged":false,"session":"4.1.87","strategy":{"type":"cow","volume":"df398c77-1fcc-42d2-5987-77b450071893"}}}
It's not something about permissions, but "invalid argument".
Here's my docker-compose.yml:
~It seems that the invalid argument error is related to #42. But the workaround mentioned there (mount a volume to /worker-state) does not work for me.~
Update: The message describing the actual error is produced by the kernel, and it can be viewed with journalctl -f.
Jun 24 16:28:48 swarm-2 kernel: overlayfs: idmapped layers are currently not supported
It's said that the support for idmapped layers in overlayfs will be available in Linux 5.19 (current mainline kernel is 5.18).
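As a rough way to check a node against that version threshold, here is a minimal sketch. It assumes the 5.19 figure mentioned above is accurate and that the kernel release string starts with major.minor. The function names are illustrative, and distribution kernels with backported patches may behave differently from what the version number suggests.

```python
import re
from typing import Tuple

# Assumption taken from the comment above: idmapped overlayfs layers
# are expected to be supported from Linux 5.19 onward.
MIN_KERNEL = (5, 19)


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as '5.18.0-51-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def may_support_idmapped_overlay(release: str) -> bool:
    """Return True if the release is at or above the assumed threshold."""
    return parse_kernel_release(release) >= MIN_KERNEL
```

On a node, the release string can be obtained with `platform.release()` from the standard library (the same value `uname -r` prints) and passed to `may_support_idmapped_overlay`.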
It might be worth noting in the documentation that this won't work on Docker Swarm because the worker requires privileged mode. The database and web containers will work just fine; however, the worker node will fail with some very cryptic error messages like:
{"timestamp":"2019-09-30T14:31:24.520408669Z","level":"error","source":"guardian","message":"guardian.starting-guardian-backend","data":{"error":"bulk starter: mounting subsystem 'cpuset' in '/sys/fs/cgroup/cpuset': operation not permitted"}}
and
{"timestamp":"2019-09-30T14:31:24.528488853Z","level":"error","source":"worker","message":"worker.garden-runner.logging-runner-exited","data":{"error":"Exit trace for group:\ngdn exited with error: exit status 1\ndns-proxy exited with nil\n","session":"8"}}
which disappear rather quickly because the following error gets spammed repeatedly:
{"timestamp":"2019-09-30T14:31:28.144058311Z","level":"error","source":"worker","message":"worker.beacon-runner.beacon.forward-conn.failed-to-dial","data":{"addr":"127.0.0.1:7777","error":"dial tcp 127.0.0.1:7777: connect: connection refused","network":"tcp","session":"4.1.5"}}
The web node also registers the worker node leading to further confusion. Hopefully this saves someone else a couple of painful hours.
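Since the useful errors get buried under the repeated failed-to-dial lines, a small filter can help surface them. This is a sketch only, based on the JSON log shape quoted in this thread (level, source, message, data.error). The function names and the choice of which message to treat as noise are assumptions, not part of Concourse.

```python
import json
from typing import Iterable, Iterator, Optional

# Message name copied from the log line quoted above; extend as needed.
NOISY_MESSAGES = {"worker.beacon-runner.beacon.forward-conn.failed-to-dial"}


def parse_line(line: str) -> Optional[dict]:
    """Parse one log line, tolerating a prefix such as 'worker_1 | '."""
    start = line.find("{")
    if start == -1:
        return None
    try:
        entry = json.loads(line[start:])
    except json.JSONDecodeError:
        return None
    return entry if isinstance(entry, dict) else None


def interesting_errors(lines: Iterable[str]) -> Iterator[str]:
    """Yield each distinct error once, skipping the known-noisy messages."""
    seen = set()
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry.get("level") != "error":
            continue
        message = entry.get("message", "")
        if message in NOISY_MESSAGES:
            continue
        data = entry.get("data")
        error = data.get("error", "") if isinstance(data, dict) else ""
        key = (message, error)
        if key not in seen:
            seen.add(key)
            yield f"{entry.get('source', '?')}: {message}: {error}"
```

One way to use it is to loop over `interesting_errors(sys.stdin)` in a small script and pipe the worker's log output into it, so the cpuset and gdn errors show up once each instead of scrolling away.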