ros-infrastructure / ros_buildfarm

ROS buildfarm based on Docker
Apache License 2.0

User scripts with rolling (ubuntu noble) fail from existing UID=1000 #1060

Closed rkent closed 2 weeks ago

rkent commented 3 months ago

The Docker image ubuntu:noble now ships with an existing user "ubuntu" with UID=1000, so trying to create a new user "buildfarm" with UID=1000 fails. Somehow this does not hit the actual buildfarm, where the buildfarm UID is currently being set to 1001 in those cases.

But when a user runs the scripts themselves, typically with their UID=1000, the build fails.

For example, if I run ros_buildfarm.scripts.doc.generate_doc_script for rolling (which uses ubuntu:noble), the created job fails because the Dockerfile contains RUN useradd -u 1000 -l -m buildfarm, which fails since there is already a user ubuntu with UID=1000.

I've played around with the easiest ways to fix this. In a Dockerfile you can add:

RUN if [ $(id -nu 1000) ]; then userdel -r $(id -nu 1000); fi

before the RUN useradd ... and it works whether or not a user with UID=1000 exists (that is, on both ubuntu:noble and earlier versions). In an empy template in ros_buildfarm you would add instead:

RUN if [ $(id -nu @uid) ]; then userdel -r $(id -nu @uid); fi

(I also tried RUN useradd -o -u @uid -l -m buildfarm, which allows a duplicate UID, but that does not work because the buildfarm gets confused about whether we are using /home/ubuntu or /home/buildfarm.)
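To make the logic of that RUN line easier to follow, here is a minimal shell sketch of the same check split into two steps. The function names owner_of_uid and free_uid are hypothetical and only for illustration; they are not part of ros_buildfarm. It assumes an id implementation that accepts a numeric UID as its argument (as the one-liner above already does), and free_uid needs root because it calls userdel.

```shell
#!/bin/sh
# Sketch only: owner_of_uid and free_uid are hypothetical helper names.

# Print the login name that owns the UID given as $1.
# Prints nothing (and still returns success) when the UID is unused.
owner_of_uid() {
    id -nu "$1" 2>/dev/null || true
}

# Remove whichever user currently owns the UID given as $1, if any.
# Requires root, since it calls userdel.
free_uid() {
    owner=$(owner_of_uid "$1")
    if [ -n "$owner" ]; then
        userdel -r "$owner"
    fi
}

# Intended use during an image build, as root (not executed here):
#   free_uid 1000 && useradd -u 1000 -l -m buildfarm
```

Unlike the one-liner, this version silences the "no such user" message that id prints on images where the UID is free, but the behaviour is otherwise meant to be the same.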

I could do a PR for this, but that issue occurs in a lot of places, every *.Dockerfile.em. Is supporting user scripts still a thing in ros_buildfarm? Am I doing something wrong running them? If you want a PR, would you prefer this in a snippet?

nuclearsandwich commented 3 months ago

> I could do a PR for this, but that issue occurs in a lot of places, every *.Dockerfile.em. Is supporting user scripts still a thing in ros_buildfarm?

Scripts are definitely still supported! This affects people running with UID=1000 as the local user on the host. Our build farms get "lucky" because the jenkins-agent is created after another user and gets UID=1001.

> Am I doing something wrong running them? If you want a PR, would you prefer this in a snippet?

A PR here would be welcome. I think @cottsay's hack was userdel ubuntu ||: so that the ubuntu user is removed unconditionally before the buildfarm user is added. Since we have to touch every Dockerfile.em that contains the useradd logic anyway, moving that logic into a centralized snippet would be most welcome!
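For readers unfamiliar with the ||: idiom mentioned above: : is the shell's null command and always succeeds, so command ||: swallows a failure from command. The sketch below shows the pattern with a hypothetical wrapper named run_tolerant; the wrapper is only for illustration and is not something the buildfarm defines.

```shell
#!/bin/sh
# Sketch only: run_tolerant is a hypothetical name for illustration.

# Run the given command; if it fails, fall through to ':' (the null
# command), so the overall result is always success.
run_tolerant() {
    "$@" || :
}

# In an image build this corresponds to (not executed here):
#   userdel ubuntu || :
# which succeeds whether or not a user named ubuntu exists.
```

The trade-off against the conditional id-based check is that this form hard-codes the user name ubuntu rather than looking up whoever owns the target UID, and it also hides unrelated userdel failures.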