[enhancement]: Add option to skip GrantContainerUserSUDOPrivilege in job container init

joshchngs commented 1 year ago

Describe your feature request here

I would like to be able to skip this section, per container definition:

https://github.com/microsoft/azure-pipelines-agent/blob/0b8aefd218bc7a6d39a4b406295c873c494c4857/src/Agent.Worker/ContainerOperationProvider.cs#L636-L646

If my image user already has sudo privileges, these steps are unnecessary. If I define the container with --user <something>, they cannot work at all. I want to be able to tell the agent that I've already handled setting up sudo for the job user.
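As a rough illustration of what "already has sudo" could mean in practice, here is a minimal sketch of a check that could be run inside the image. It assumes `sudo` is installed and relies on its `-n` (non-interactive) flag, which fails instead of prompting when a password would be required. The function name `has_passwordless_sudo` and its optional argument are hypothetical, not part of the agent.

```shell
#!/usr/bin/env bash
# has_passwordless_sudo [SUDO_CMD]
# Succeeds if SUDO_CMD (default: sudo) can run a command without prompting.
# The optional argument exists only so the check can be exercised with a stub.
has_passwordless_sudo() {
  local sudo_cmd="${1:-sudo}"
  # Fail early if the command is not available at all.
  command -v "$sudo_cmd" >/dev/null 2>&1 || return 1
  # -n: never prompt; exit non-zero if a password would be needed.
  "$sudo_cmd" -n true >/dev/null 2>&1
}

if has_passwordless_sudo; then
  echo "current user can sudo without a password"
else
  echo "passwordless sudo is not available for the current user"
fi
```

Running this as the image's default user (for example via `docker run --rm <image> bash check.sh`, where `check.sh` is a hypothetical file name) would show whether the agent's sudo setup is redundant for that image.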

Why?

The documentation says that the container user must be able to run these commands. In reality, only root can do this, as su will prompt for a password otherwise. Therefore, it is only possible to use images with USER=root.

Side note: I don't understand the /etc/sudoers write. This file should be read-only (mode 0440), and AFAIK it is in most base images. I don't understand how this command ever succeeds.
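One possible explanation, offered as an assumption rather than something confirmed in this thread: on Linux, a process running as root with the default CAP_DAC_OVERRIDE capability is not blocked by permission bits, so an append to a 0440 file can still succeed when it runs as root. The sketch below reproduces this on a scratch file standing in for /etc/sudoers; `file_mode` is a hypothetical helper and uses GNU `stat` syntax.

```shell
#!/usr/bin/env bash
# file_mode PATH
# Print the octal permission bits of PATH (GNU/busybox stat syntax).
file_mode() {
  stat -c '%a' "$1"
}

# Scratch file standing in for /etc/sudoers; the real file is not touched.
demo=$(mktemp)
chmod 0440 "$demo"
echo "mode: $(file_mode "$demo")"

# As a non-root user the append is denied by the 0440 mode.
# As root it normally succeeds, because root bypasses the permission bits.
if ( echo '# test entry' >> "$demo" ) 2>/dev/null; then
  echo "append succeeded (uid $(id -u))"
else
  echo "append denied (uid $(id -u))"
fi
rm -f "$demo"
```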

There are a few use cases where USER=root is not possible, including mine. The highlighted docker execs are the root cause, and are mentioned in a few places.

Workarounds

The links above mention a few workarounds, such as passing --user 0:0 as a docker create option. This doesn't work for all use cases.

I'm currently doing:

Dockerfile

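# setuid: groupadd/usermod run with root privileges for any user;
# u+w makes /etc/sudoers writable by its owner.
RUN chmod u+s /usr/sbin/groupadd /usr/sbin/usermod && \
    chmod u+w /etc/sudoers
# Replace su with a stub so it never prompts for a password.
COPY su_hack.sh /bin/su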

su_hack.sh

#!/usr/bin/env bash
echo "'su' has been disabled in this container"

Caveats

The use case enabled by adding the option is only viable if the image already has a correctly configured user with matching UID/GID for the container init step to find. This means that either the agent host user UID/GID needs to be controlled, or the image needs to be rebuilt on each agent before it's used.
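To make the UID/GID caveat concrete, here is a minimal sketch of a check for an entry with a given UID and primary GID in a passwd(5)-format file. It assumes the image keeps its users in /etc/passwd; the helper name `has_user_with_ids` is hypothetical, and how the agent itself looks up the user is not shown here.

```shell
#!/usr/bin/env bash
# has_user_with_ids PASSWD_FILE UID GID
# Succeeds if PASSWD_FILE (passwd(5) format) contains an entry whose
# UID (field 3) and primary GID (field 4) both match.
has_user_with_ids() {
  awk -F: -v uid="$2" -v gid="$3" \
    '$3 == uid && $4 == gid { found = 1 } END { exit !found }' "$1"
}

# Example: does the local /etc/passwd have an entry for the current UID/GID?
if has_user_with_ids /etc/passwd "$(id -u)" "$(id -g)"; then
  echo "found a user with UID $(id -u) and GID $(id -g)"
else
  echo "no user with UID $(id -u) and GID $(id -g)"
fi
```

Inside an image, the same function could be called with the agent host user's UID and GID to see whether a rebuild is needed on that host.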

vmapetr commented 1 year ago

Hi @joshchngs, thanks for reporting! We are currently working on higher-priority issues but will get back to this one soon.

pixdrift commented 1 year ago

Thanks for raising this issue. I think this whole block of code that interacts with and modifies the container needs discussion; it causes us no end of issues when running generic container images: https://github.com/microsoft/azure-pipelines-agent/blob/master/src/Agent.Worker/ContainerOperationProvider.cs#L544-L764

I think more 'configuration knobs' to turn off the agent's behaviour around creation, initialisation, and execution of containers, including this entire block of code, would be a useful addition.

The agent always mounting in the docker.socket (or all the bind mounts for that matter) should probably be optional too!

github-actions[bot] commented 8 months ago

This issue has had no activity in 180 days. Please comment if it is not actually stale

joshchngs commented 8 months ago

@vmapetr I think your bot is throwing shade. Any progress?

asad26 commented 2 months ago

Any update on this? When can we have such a feature? Thanks.
