Rootless docker mounts have the wrong permissions

Clockwork-Muse commented 3 years ago

As of Docker CE 20.10.0 (2020-12-08), support for docker to run rootless has moved out of experimental status and into mainline. While remote containers work, the namespace remapping means that any mounted directories end up with the root uid (uid 0), and so any container user (eg, via containerUser or remoteUser) lacks the permissions to modify these files/directories.

VSCode Version: 1.54.1
Local OS Version: Ubuntu 20.04
Remote OS Version: Any
Remote Extension/Connection Type: Docker rootless

Steps to Reproduce:

Install docker as normal. Do not do the normal post-install steps (eg, adding a docker group and adding the user).
Set up docker as rootless
Create a devcontainer - Ctrl+Shift+P -> Remote Containers: Add Development Container Configuration Files -> Alpine
Start container - Ctrl+Shift+P -> Remote Containers: Rebuild and Reopen in Container
Attempt to add a file to the mapped source directory. Get permissions error.

Strictly speaking, this is probably more a docker configuration issue. Unfortunately docker doesn't seem to have an equivalent to podman's --userns=keep-id (as is mentioned in some of the issues on here). Although my current project allows me to use a workaround of running as the container root, this isn't possible for everything (and does mean that some aspects of the development environment would no longer match any deployment environment).

Does this issue occur when you try this locally?: No
Does this issue occur when you try this locally and all extensions are disabled?: No

chrmarti commented 3 years ago

With two UIDs (root and the regular user) in the container, you probably need to map these to two UIDs on the host. Is there a way to control that mapping with Docker?

Clockwork-Muse commented 3 years ago

sort of

You have to manually set up remapping like this, I think - although note the permissions don't actually reflect the "real" user (this is based on the namespacing feature). That's not really tenable.

chrmarti commented 3 years ago

I think you could add the host UID of the container user to the host folder's owning group to make that folder writeable from within the container. Haven't tried yet myself.

Clockwork-Muse commented 3 years ago

.... can you even add a non-existent user to a group? Otherwise you're having to set up the same set of ids as that example I linked. Even then, this likely requires sudo permissions to set up, which isn't a requirement for running rootless docker (assuming docker is already installed on the machine, rootless can be run/set up without sudo).

Chuxel commented 3 years ago

@chrmarti I was actually talking with @egamma about rootless support since I'd love to start making this the default recommendation for Linux users given the security benefits it provides. (I mean, one of the reasons Podman exists is to solve this historic Docker problem.)

The issue here is that IDs get remapped a bit more like we have historically seen on Windows and macOS. However, by default, everything lands as root, which sucks. I've got an env set up this way and have 1000:1000 both inside and outside the container and have this problem as well. It's a different mode entirely.

To get things to work like they have historically, you need to add the following to the run command (or in Docker Compose): --userns host. However, I get a network error when doing this.

The one other bug is that we appear not to respect DOCKER_HOST in the Clone Repository in Container Volume flow, which breaks because you end up with a different socket path here (e.g. /run/user/1000/docker.sock) - which unfortunately cuts off the alternative.

chrmarti commented 3 years ago

Would --userns host give up part of "rootless"?

One issue might be that if we need 2 users (root and 1000) in the container, we also need 2 users to map to on the host. (I might be wrong on this though, I haven't explored rootless much yet.)

Chuxel commented 3 years ago

@chrmarti The daemon is still running as your user, but I'm not confident enough to say for sure what the impact is - one of the reasons I tried to spin it up that way. I also walked the docker source code and it doesn't look like you can pass in an arbitrary namespace into the argument. Given the way rootless passes in the current user, it maps root to my local user (0 to 1000), so trying to expand the subuid range doesn't work ... docker bombs on start.

I'm almost wondering if the correct solution has to be to run the docker engine as another non-root user (e.g. called "docker-root"), which then should allow you to update /etc/subuid to map your actual user directly (e.g. clantz:1000:1). The "docker-root" user then is actually the one mapped to UID 0 in the container.

Aside from this permissions issue, things appeared to work as expected otherwise, so at a minimum getting the socket fix in for volumes would unblock that in the configuration here.

Chuxel commented 3 years ago

@chrmarti Near as I can tell, this can't be resolved with rootless docker. I suspect Docker needs to add the equivalent of keep-id (and may even require changes to rootlesskit).

I was able to get a separate user up running rootless docker, add an ACL to the docker socket to let another user use it, and get a user other than root for the bind mount, but it wasn't the same user ID and it was really hacky. If you restarted docker, you need to reapply the ACL. Just adding the user to a docker group doesn't work, because the GID on the socket isn't the typical "docker" one but a gid mapped one. Even if you do that, the other user does not (by design) have access to /run/root/ for the other user by default. Maybe there's some other option but I'm not seeing it at this moment.

So, I think volumes is the best recommendation in a rootless configuration at the moment unless you want to run as "root" in the container. That does work, but obviously can pose other issues.

chrmarti commented 3 years ago

So to rephrase: The way Docker implements rootless works well only when using root (mapped to the local user locally) inside the container? And this might be more general than just with how Docker is approaching this because there is no way to map two container user ids to a single host user id? (Partly because the reverse mapping must be unique.)

Chuxel commented 3 years ago

So to rephrase: The way Docker implements rootless works well only when using root (mapped to the local user locally) inside the container? And this might be more general than just with how Docker is approaching this because there is no way to map two container user ids to a single host user id? (Partly because the reverse mapping must be unique.)

Yeah, but there's a bit more since I tried to work around that. To recap:

You can use a different nonroot user inside the container, but bind mounts come across with translated IDs. When the IDs are the same user as the rootless docker daemon is running as, that translated ID will be 0. However, in the case of volumes, there's no issue because the IDs in the container will be the expected ones. So "Clone Repository in Container" (using a volume) should work as expected since the UID in the container of the non-root user would be 1000 and there's no translation at play.

In the experiment above, what I did was create a user with UID 5000 and ran docker rootless as that. I then switched to my user w/UID of 1000 and applied an access control list to allow me to use the docker socket. I also modified /etc/subuid with clantz:1000:1. However, when I started the container with a bind mount, the UID in the container was 65441 instead of 1000.

In concept, we could alter the container user's UID here, but the ACL has to be re-applied each time docker starts in this configuration. So I'm a bit stumped at the moment.

Clockwork-Muse commented 3 years ago

So "Clone Repository in Container" (using a volume) should work as expected since the UID in the container of the non-root user would be 1000 and there's no translation at play.

.... which means that you now have two copies of the repository in play (local and volume). Or you need to build the container ahead of time (or get it from a registry).

Even then, though, this isn't going to work for all projects - my current projects, for example, are doing graphics work, and so sometimes need access to the X11 socket (the deployed project is targeting EGL, so I don't require it for the built final product, but it's necessary for tools and debug output).

Chuxel commented 3 years ago

@Clockwork-Muse "Clone Repository in Container" supports the same container editing flow as using local bind mounts. So you don't need to do anything ahead of time. You'd be prompted as usual and can use the rebuild flow.

That said, yes, you end up with the source in a docker volume.

Interestingly, running as root in the container works fine with bind mounts and will not cause the historic problem of the wrong UID/GID being used. It maps to your local user. Docker (really rootlesskit) seems to hard code the mapping of your local user UID/GID to 0 in this mode - so unclear if there's an alternative. It certainly isn't obvious if there is one somewhere.

bhack commented 2 years ago

@chuxel Any news on this about non root user in the container with rootless Docker?

bhack commented 2 years ago

P.s. I found this upstream ticket https://github.com/moby/moby/issues/41497

Chuxel commented 2 years ago

To my knowledge, this simply isn't possible. You need to use root when running rootless with bind mounts.

leohxj commented 2 years ago

Does this mean it must run in rootful (non-rootless) mode? Is rootless still not supported?

chrmarti commented 2 years ago

It works if you use root inside the container. Root in the container is mapped to your regular user on the host and with that the files in the bind mounts (e.g., the workspace folder) have the correct UID/GID (again root inside the container and regular user on host).

yaroslavsadin commented 1 year ago

@chrmarti So what is the general recommendation? Log in as root? Would the remote container feature be documented for rootless docker?

chrmarti commented 1 year ago

@yaroslavsadin I don't think we have a good answer yet. The problem is that there is a single user on the host for which there are two users (root and the regular user) inside the container and there does not seem to be a way for "rootless" to map the two users inside the container to the single user on the host. (Which seems to make sense because the reverse mapping wouldn't know to which of the two container users to map to.)

Chuxel commented 1 year ago

The recommendation here likely needs to be to use a "Clone Repository in Container" volume, and then we need to explore ways to move local content into a volume on create. #7184 also needs to be resolved. The only other option is to run as root in the container.

bhack commented 1 year ago

I suppose that it is already supported in podman with --userns=keep-id but we work on Docker.

Chuxel commented 1 year ago

Yeah, the question here was about docker specifically. Podman brings different challenges. But part of the reason that the dev container CLI is OSS (https://github.com/devcontainers/cli) is to allow the community to contribute support for alternatives where needed or to target tough problems that do not show up broadly. This extension uses the CLI under the hood. Outside of the simple scenarios, each engine seems to have its own differences and related challenges. They aren't full drop-ins. Unfortunately, for this particular case, there isn't a good answer for the Docker engine.

francoism90 commented 1 year ago

Any update on this? I'm having the same issue with Docker.

igormcoelho commented 1 year ago

Dear all, as a user of devcontainers with rootless docker who has already struggled with the issues discussed here, I'll share 4 different solutions I've found to deal with this situation (considering Linux/Ubuntu 22.04 with a local user called "hostuser" with id 1001, which is non-standard and slightly more challenging than the typical 1000 user).

Problem: Standard usage (with the line below left commented out) leads to errors on both docker rootless and podman, and the user does not understand why nothing works:

//"remoteUser": "root"

it either logs in as vscode user, but files are all owned by root (so, Permission Denied)
it fails to build image and launch container

Possible Solution 1) docker rootless - use root on container: "remoteUser": "root"

This allows easy interaction with local files, as the internal container user "root" (0) is mapped directly to "hostuser" (1001). I use this often when developing simple C/C++/Python libraries, since no setup is necessary, as long as the image supports root. Note that the internal vscode user (1000) is mapped as external 166535 (BROKEN).

Possible Solution 2) docker rootless - use vscode user on container and chown your files: // "remoteUser": "vscode"

If you want to keep your regular user inside the container and use vscode for development, a solution is to change the ownership of your project files:

$ rootlesskit chown -R hostuser:hostuser projectfolder/

Now, consider the following matching of subuid:

$ grep hostuser: /etc/subuid
hostuser:165536:65536

Your projectfolder files should become 166536 (if your user is 1001 like mine... if it's 1000, the value will be lower). Now, you can use devcontainers normally, but won't be able to directly access your files on host! To get them back to "normal", just do the opposite matching from root to your hostuser:

$ rootlesskit chown -R root:root projectfolder/

Now your files are back to your user domain. It means that docker works like this: root (0) on container means (hostuser) 1001 outside; and vscode user (1001) inside means 166536 outside.
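
As a rough illustration of the arithmetic above, here is a small Python sketch that reads a subordinate UID range from /etc/subuid-style text and computes the host owner that files end up with after the namespaced chown. The helper names `parse_subuid` and `host_owner_after_chown` are hypothetical, and the `start + uid - 1` rule is only the behaviour observed in this comment, not a documented guarantee.

```python
def parse_subuid(text, user):
    """Return (start, count) for the first entry of `user` in /etc/subuid-style text.

    Each entry has the form name:start:count. Returns None if the user is absent.
    """
    for line in text.splitlines():
        parts = line.strip().split(":")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, start, count = parts
        if name == user:
            return int(start), int(count)
    return None


def host_owner_after_chown(namespace_uid, host_uid, subuid_start):
    """Host UID seen on disk after `rootlesskit chown <namespace_uid>` (observed rule).

    Assumption: UID 0 in the namespace is the host user, and UID i >= 1
    lands on subuid_start + i - 1, as reported in this comment.
    """
    if namespace_uid == 0:
        return host_uid
    return subuid_start + namespace_uid - 1


if __name__ == "__main__":
    sample = "hostuser:165536:65536\n"
    start, _count = parse_subuid(sample, "hostuser")
    # chown to hostuser (1001) inside the namespace
    print(host_owner_after_chown(1001, 1001, start))
    # chown back to root (0) inside the namespace
    print(host_owner_after_chown(0, 1001, start))
```

With the sample entry, chowning to hostuser (1001) inside the namespace gives host owner 166536, and chowning back to root gives 1001, matching the values quoted above.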

Possible Solution 3) use podman rootless - "remoteUser": "root"

Go to "Dev Containers: Settings for Dev Containers" and update Docker Path to podman (you need to install podman first). Then, add this option: "remoteUser": "root"

You will be able to log in as the root user, and all files created by root will be mapped to hostuser (1001). The strange thing here is that the internal vscode user gets the wrong id (1000 instead of 1001), so:

internal root (0) is mapped as external hostuser (1001)
internal vscode user (1000) is mapped as external 166535 (BROKEN)
internal user 1001 (does not exist) is mapped as external 166536 (BROKEN)

Although 1001 <-> 166536 is consistent with docker, this is likely the most inconsistent strategy, as internal vscode user is broken ("containerUser": "vscode" option does not help either)

Possible Solution 4) use podman rootless - matching ids with "runArgs": ["--userns=keep-id"]

Go to "Dev Containers: Settings for Dev Containers" and update Docker Path to podman (you need to install podman first). Then, add these options:

    "runArgs": ["--userns=keep-id"],
    "containerUser": "vscode",

With this option, you log in as vscode (1001) in the container, with an id that correctly matches hostuser (1001). If you create some file as root (or with sudo), some interesting behavior happens:

internal vscode user (1001) is mapped as external hostuser (1001)
internal root (0) is mapped as external 165536 user, but in hostuser (1001) group

It means that all files are accessible in host, even root-created ones.

So, for podman, all other configurations I tried with different remoteUser, containerUser and userns were either ineffective or inconsistent/broken, so I only recommend these two above. Sometimes containers seemed broken, and the rebuild option didn't help... my only solution was to run podman system prune -a and destroy everything! Then it worked again.

In short, I managed to make things work, and usually adopt either Solution 1 (root on docker rootless) or Solution 4 (userns on podman rootless), so I would recommend that the extension:

try to detect if docker is rootless (with docker info maybe?)
generate devcontainer.json file according to some valid strategy
- if using docker rootless, enable "remoteUser": "root" by default
- if using podman, enable both "runArgs": ["--userns=keep-id"] and "containerUser": "vscode" by default
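
A minimal sketch of that detection idea in Python, assuming the Docker CLI is on the PATH: `docker info --format '{{json .SecurityOptions}}'` prints a JSON list that contains `name=rootless` when the daemon runs rootless. The helper names and the shape of the returned overrides are hypothetical. They only restate the suggestions above and are not what the extension or the dev container CLI actually does.

```python
import json
import subprocess


def is_rootless(security_options):
    """True if any entry of Docker's SecurityOptions list names 'rootless'."""
    for option in security_options:
        # Entries look like "name=seccomp,profile=default" or "name=rootless".
        if "name=rootless" in option.split(","):
            return True
    return False


def suggested_overrides(engine, rootless):
    """devcontainer.json properties proposed in this comment (hypothetical helper)."""
    if engine == "podman":
        return {"runArgs": ["--userns=keep-id"], "containerUser": "vscode"}
    if engine == "docker" and rootless:
        return {"remoteUser": "root"}
    return {}


def read_security_options(docker_path="docker"):
    """Ask the Docker daemon for its security options (needs a reachable daemon)."""
    out = subprocess.run(
        [docker_path, "info", "--format", "{{json .SecurityOptions}}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out) or []
```

For example, `suggested_overrides("docker", is_rootless(read_security_options()))` would yield the root setting only when the daemon reports rootless mode. The podman branch skips detection entirely, since the suggestion above applies to podman in general.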

Finally, I also managed to work with "docker-from-docker" (or DooD) on docker rootless (note that docker-in-docker "dind" typically won't work due to the lack of the --privileged option on rootless), by using this:

"mounts": [
{
     "source": "/run/user/1001/docker.sock",
     "target": "/var/run/docker-host.sock",
     "type": "bind"
}
],
"remoteUser": "root",

and injecting variable LOCAL_WORKSPACE_FOLDER to my .env files for docker-compose, inside the project, to properly mount the paths related to host... but I haven't tried this one without root on docker rootless or even podman (too many alternatives!). Hope this helps, good luck!
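
Since the socket path above hard-codes UID 1001, here is a small Python sketch that builds the same fragment for an arbitrary UID. It assumes the rootless socket lives at /run/user/<uid>/docker.sock, as in the snippet above (this depends on XDG_RUNTIME_DIR and may differ on your system). The function names are hypothetical.

```python
import json
import os


def rootless_socket_mount(uid, target="/var/run/docker-host.sock"):
    """Bind-mount entry for the rootless Docker socket of the given host UID.

    Assumption: the socket is at /run/user/<uid>/docker.sock.
    """
    return {
        "source": f"/run/user/{uid}/docker.sock",
        "target": target,
        "type": "bind",
    }


def dood_fragment(uid):
    """devcontainer.json fragment mirroring the docker-from-docker setup above."""
    return {"mounts": [rootless_socket_mount(uid)], "remoteUser": "root"}


if __name__ == "__main__":
    # Print the fragment for the current host user, ready to paste and adapt.
    print(json.dumps(dood_fragment(os.getuid()), indent=4))
```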

Chuxel commented 1 year ago

Yeah, the core idea behind rootless is to allow you to use root within the container, so that's always the path of least resistance. This is one of the reasons that we try to ensure both the root and non-root user work as expected in images and Features.

FWIW - All of this logic is in the open source CLI. So PRs are welcome to make improvements like these. https://github.com/devcontainers/cli

igormcoelho commented 1 year ago

The problem is that there is a single user on the host for which there are two users (root and the regular user) inside the container and there does not seem to be a way for "rootless" to map the two users inside the container to the single user on the host

In fact @chrmarti , both podman and docker seem to use similar strategies, as each subuser must have a matching id on host. Only the count is slightly different. Consider the following:

$ echo $UID
1001
$ grep hostuser: /etc/subuid
hostuser:165536:65536

for docker and rootlesskit, it seems to map:

0 to hostuser (1001 in my case, but commonly 1000)
1 to 165536, 2 to 165537, ..., 999 to 166534, 1000 to 166535, 1001 to 166536, 1002 to 166537; so: i != 0 to (165536 + i -1)

for podman --userns=keep-id, it seems to map:

1001 to hostuser (1001 in my case)
0 to 165536, 1 to 165537, 2 to 165538, ..., 999 to 166535, 1000 to 166536, 1001 to 1001, 1002 to 166537; so: i > 1001, (165536 + i -1) same as docker, and i<1001, (165536 + i).

So, the convenience of podman is that it really matches the file uid of host and container with exactly the same number, while docker implicitly considers 0 of container to be host uid... So, effective change in file ids could be necessary to work as non-root docker rootless container (as requested in this issue), such as $ rootlesskit chown -R 1001:1001 projectfolder/ and also $ rootlesskit chown -R root:root projectfolder/.
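
To make the two mappings easier to compare, here is a Python sketch of the rules exactly as described above. These are the observed rules from this comment, not taken from the rootlesskit or podman sources, and the function names are hypothetical. The range check assumes the host UID itself is smaller than the subuid count.

```python
def docker_rootless_host_uid(container_uid, host_uid, subuid_start, subuid_count=65536):
    """Host UID for a container UID under rootless Docker (observed rule).

    0 maps to the host user; i >= 1 maps to subuid_start + i - 1.
    """
    if not 0 <= container_uid <= subuid_count:
        raise ValueError("container UID outside the mapped range")
    if container_uid == 0:
        return host_uid
    return subuid_start + container_uid - 1


def podman_keep_id_host_uid(container_uid, host_uid, subuid_start, subuid_count=65536):
    """Host UID for a container UID under podman --userns=keep-id (observed rule).

    The host user's own UID maps to itself; UIDs below it shift by subuid_start,
    UIDs above it shift by subuid_start - 1.
    """
    if not 0 <= container_uid <= subuid_count:
        raise ValueError("container UID outside the mapped range")
    if container_uid == host_uid:
        return host_uid
    if container_uid < host_uid:
        return subuid_start + container_uid
    return subuid_start + container_uid - 1


if __name__ == "__main__":
    for uid in (0, 1000, 1001, 1002):
        print(
            uid,
            docker_rootless_host_uid(uid, 1001, 165536),
            podman_keep_id_host_uid(uid, 1001, 165536),
        )
```

With host UID 1001 and the subuid range starting at 165536, only container root lands on the host user under Docker, while under keep-id only container UID 1001 does.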

ajoshiusc commented 8 months ago

vscode connects as the user 'vscode', while it needs to connect as 'root' to have the correct permissions. (This is the root user in the container, not the root user on the host, so it should be fine security-wise.) In your devcontainer.json:

// Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.
// "remoteUser": "root"

Clockwork-Muse commented 8 months ago

@ajoshiusc - Yes, uid mapping means that root maps to the external user uid... for file access, but not for certain other things. For instance, it will mean attempting to do docker-outside-of-docker (mapping the docker socket into the container) will fail because the uid is wrong, as will similar attempts with .X11 ports.
Further, it's not the preferred solution, because there are certain portions of the kernel that aren't really secured against this. Additionally, there are any number of tools or other scenarios where you either don't want to or can't run it as root.

Which is why mapping the internal uid to the external uid is vastly better.

blurayne commented 6 months ago

Another scenario: I usually use podman instead of docker. I also installed rootless docker and it can run side by side - except under the VSCode Dev Containers extension. I need to run rootful docker (in a Lima VM or wherever) since I run a terraform project which just assumes rootful docker and nothing else (which is also the default for most of our developers) :/

I used the following under settings > docker.environment, in devcontainer.json and also in settings.json:

"docker.environment": {
  "DOCKER_HOST": "unix:///run/docker.sock",
  "DOCKER_CONTEXT": "rootful"
}

But unfortunately the result is:

$ ps aux | grep -P docker.+exec.+bash
ctang    1826105  0.0  0.0 736612 20952 pts/0    Sl   09:19   0:00  |   |   \_ docker exec -i -u vscode -e SHELL=/bin/bash -e VSCODE_AGENT_FOLDER=/home/vscode/.vscode-server -w /home/vscode/.vscode-server/bin/0ee08df0cf4527e40edc9aa28f4b5bd38bbff2b2 4f67dfcee967d5aa1652d72393a299f161b37ebf89a38e0d2de414d64a3d2321 /home/vscode/.vscode-server/bin/0ee08df0cf4527e40edc9aa28f4b5bd38bbff2b2/bin/code-server --log debug --force-disable-user-env --server-data-dir /home/vscode/.vscode-server --telemetry-level all --accept-server-license-terms --host 127.0.0.1 --port 0 --connection-token-file /home/vscode/.vscode-server/data/Machine/.connection-token-0ee08df0cf4527e40edc9aa28f4b5bd38bbff2b2 --extensions-download-dir /home/vscode/.vscode-server/extensionsCache --start-server

$  cat /proc/1826105/environ | tr '\0' '\n'  | grep -i docker
DOCKER_BUILDKIT=1
DOCKER_CONTEXT=rootful
DOCKER_HOST=unix:///run/user/1000/docker.sock
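
The same check can be scripted. Below is a minimal Python sketch that mirrors the `cat | tr | grep -i docker` pipeline by parsing the NUL-separated contents of /proc/<pid>/environ. The helper names are hypothetical, and treating a socket under /run/user/ as rootless is only a heuristic based on the paths seen in this thread.

```python
def docker_vars(raw_environ):
    """Extract entries mentioning 'docker' from the bytes of /proc/<pid>/environ."""
    result = {}
    for entry in raw_environ.split(b"\0"):
        # Mirror `grep -i docker`: case-insensitive match anywhere in the entry.
        if b"docker" not in entry.lower():
            continue
        key, sep, value = entry.partition(b"=")
        if sep:
            result[key.decode(errors="replace")] = value.decode(errors="replace")
    return result


def looks_like_rootless_socket(docker_host):
    """Heuristic only: per-user sockets under /run/user/ suggest a rootless daemon."""
    return docker_host.startswith("unix:///run/user/")


def read_docker_vars(pid):
    """Read the Docker-related environment of a running process (Linux only)."""
    with open(f"/proc/{pid}/environ", "rb") as handle:
        return docker_vars(handle.read())
```

For the process shown above, this would flag that DOCKER_CONTEXT says rootful while DOCKER_HOST still points at the per-user socket.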

Also when I add

"containerEnv": {
    "HOST_DOCKER_HOST": "${localEnv:DOCKER_HOST}",
    "HOST_DOCKER_CONTEXT": "${localEnv:DOCKER_CONTEXT}"
},

Within the container I get:

 $ printenv | grep DOCKER
HOST_DOCKER_CONTEXT=rootful
HOST_DOCKER_HOST=unix:///run/user/1000/docker.sock

And it is run as a rootless container. I tried to wrap vscode in an execution environment (unsetting DOCKER_HOST, exporting DOCKER_CONTEXT). While the build happens on rootful docker, execution is still happening in rootless docker.

TL;DR runArgs and settings.docker.environment are useless since the extension doesn't fully support them in its execution. Also, the global settings.json docker env seems not to be taken into account in some executions.

This really needs to improve. Also general support for rootless. It doesn't make sense to base something on container environment when users are forced to run in rootful environments (even if that's in a VM, but again containers vs VM??!)

microsoft / vscode-remote-release

Rootless docker mounts have the wrong permissions #4646