adoptium / infrastructure

This repo contains all information about machine maintenance.
Apache License 2.0

centos7_docker_image_updater failing on RISC-V #3550

Open sxa opened 2 months ago

sxa commented 2 months ago

Ref: https://ci.adoptium.net/job/centos7_docker_image_updater/504/execution/node/89/log/

+ docker build -t adoptopenjdk/ubuntu2004_build_image:linux-riscv64 --build-arg git_sha=2a029fd406c6f35f70c32ef5a814933fa11297ee -f ansible/docker/Dockerfile.Ubuntu2004-riscv64 .
ERROR: permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/_ping": dial unix /var/run/docker.sock: connect: permission denied

The job runs on the label docker&&linux&&riscv64 and selected Scaleway machine 1 for that particular run.

sxa commented 2 months ago

I've restarted the Jenkins agent on the -1 machine in case it wasn't in the docker group before. I've been able to run docker operations as the jenkins user on the command line without issues, so hopefully it will be OK. Machine -8 is in a similar situation: running id from the Jenkins script console shows that the agent is NOT in the docker group, but logging into the machine shows that the jenkins user is.

All of the others seemed similar - I verified 2 and 8. On that basis I've restarted the agent on each of the boards 1 through 10.

Note that not all of them picked up the docker group after the restart, so additional remediation may be required on the machines.
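
For anyone repeating the check, here is a minimal sketch, assuming a POSIX shell on the agent. The helper names has_group and check_docker_group are hypothetical and not part of any Adoptium script. It relies on id -Gn with no user argument reporting the groups of the calling process, whereas id -Gn jenkins reads the account database. That difference is one plausible reason the script console and a fresh login can disagree.

```shell
#!/bin/sh
# Hypothetical helper: succeed if group name $1 appears as a whole word
# in the space-separated group list $2.
has_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: report whether the *current process* carries the
# given group (default: docker). "id -Gn" without a user argument lists
# the groups of the calling process, not the account database entry.
check_docker_group() {
    target="${1:-docker}"
    if has_group "$target" "$(id -Gn)"; then
        echo "process has group $target"
    else
        echo "process lacks group $target"
    fi
}
```

Calling check_docker_group from a shell started by the agent and again from a fresh login as the jenkins user should show whether the two disagree on a given board.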

sxa commented 2 months ago

I've switched docker and dockerBuild for dockerX and dockerBuildX to avoid problems until someone can fix this on the machines.