emacs-eldev / eldev

Elisp development tool
https://emacs-eldev.github.io/eldev/
GNU General Public License v3.0

docker command gets "permission denied" (xhost +local:root does not fix) #85

Closed Trevoke closed 7 months ago

Trevoke commented 1 year ago

I am on Ubuntu 22, using the most recent Docker Desktop installation available.

If I use docker run -it --rm silex/emacs:27.2 then this properly launches Emacs in the Docker container.

When I try eldev -dt docker 27.2 eval 1 I get this:

Started up on Mon Apr 24 17:42:37 2023
Running on GNU Emacs 27.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.20, cairo version 1.16.0)
 of 2020-09-19
Project directory: ‘/home/stag/src/projects/org-gtd.el/’
No file ‘/home/stag/.eldev/config’, not applying user-specific configuration
Loading file ‘Eldev’...
Using package archive ‘gnu’ at ‘https://elpa.gnu.org/packages/’ with priority 300
Using package archive ‘gnu-devel’ at ‘https://elpa.gnu.org/devel/’ with priority 190
Using package archive ‘melpa-stable’ at ‘https://stable.melpa.org/packages/’ with priority 200
Using package archive ‘melpa-unstable’ at ‘https://melpa.org/packages/’ with priority 100
No file ‘Eldev-local’, not customizing build
Executing command ‘docker’...
Full command line to run a Docker process:
  /usr/local/bin/docker run --rm -e 'HOME=/org-gtd.el/.eldev/docker-home' -u 1000:1000 -v /home/stag/src/projects/org-gtd.el/:/org-gtd.el -w /org-gtd.el -v /home/stag/.eldev/27.1/bootstrap/eldev-1.3.1/bin/eldev:/org-gtd.el/.eldev/docker-home/bin/eldev -v /home/stag/.eldev/global-cache:/org-gtd.el/.eldev/docker-home/.eldev/global-cache silex/emacs:27.2 sh -c 'export PATH="$HOME/bin:$PATH" && eldev eval 1'
Bootstrapping Eldev for Emacs 27.2 from MELPA Stable...

Creating directory: Permission denied, /org-gtd.el/.eldev/docker-home/.eldev/27.2

Docker process exited with error code 255
Finished erroneously on Mon Apr 24 17:42:38 2023

If I create directories all the way down to mkdir -p .eldev/docker-home/.eldev/27.2/bootstrap then my next failure is

Full command line to run a Docker process:
  /usr/local/bin/docker run --rm -e 'HOME=/org-gtd.el/.eldev/docker-home' -u 1000:1000 -v /home/stag/src/projects/org-gtd.el/:/org-gtd.el -w /org-gtd.el -v /home/stag/.eldev/27.1/bootstrap/eldev-1.3.1/bin/eldev:/org-gtd.el/.eldev/docker-home/bin/eldev -v /home/stag/.eldev/global-cache:/org-gtd.el/.eldev/docker-home/.eldev/global-cache silex/emacs:27.2 sh -c 'export PATH="$HOME/bin:$PATH" && eldev eval 1'
Bootstrapping Eldev for Emacs 27.2 from MELPA Stable...

Package `eldev-' is unavailable

Docker process exited with error code 255
Finished erroneously on Mon Apr 24 17:56:34 2023

I'm not really sure what is happening here, are you able to provide help?

doublep commented 1 year ago

I don't really know Docker that well. I did something with it at my job, and this part, -u 1000:1000, reminds me of the troubles I had and never found a proper way to solve, only a workaround.

What does the command id print? I.e. what is your Linux user's ID?

Trevoke commented 1 year ago

Here's its output: uid=1000(stag) gid=1000(stag) groups=1000(stag),4(adm),24(cdrom),27(sudo),30(dip),44(video),46(plugdev),108(kvm),109(render),120(lpadmin),131(lxd),132(sambashare),997(nordvpn),998(docker)

So my user id is 1000 and the eponymous group id is also 1000.

doublep commented 1 year ago

Well, I don't really know what's going on.

My system (fairly old):

~/git/org-gtd.el$ uname -a
Linux gonzo 5.10.0-9-amd64 #1 SMP Debian 5.10.70-1 (2021-09-30) x86_64 GNU/Linux

Just to be sure I deleted everything Eldev-specific first (rm -rf .eldev and erased Eldev-local). For comparison, here is what I get:

~/git/org-gtd.el$ eldev -dt docker 27.2 eval 1
Started up on Tue Apr 25 19:23:42 2023
Running on GNU Emacs 29.0.60 (build 1, x86_64-pc-linux-gnu, GTK+ Version 2.24.33, cairo version 1.16.0)
 of 2023-04-08
Project directory: ‘/home/paul/git/org-gtd.el/’
Loading file ‘/home/paul/.eldev/config’...
Loading file ‘Eldev’...
Using package archive ‘gnu’ at ‘https://elpa.gnu.org/packages/’ with priority 300
Using package archive ‘melpa-stable’ at ‘https://stable.melpa.org/packages/’ with priority 200
Using package archive ‘melpa-unstable’ at ‘https://melpa.org/packages/’ with priority 100
Loading file ‘Eldev-local’...
Executing command ‘docker’...
Full command line to run a Docker process:
  /usr/bin/docker run --rm -e 'HOME=/org-gtd.el/.eldev/docker-home' -u 1000:1000 -v /home/paul/git/org-gtd.el/:/org-gtd.el -w /org-gtd.el -v /home/paul/eldev:/eldev -v /home/paul/.eldev/global-cache:/org-gtd.el/.eldev/docker-home/.eldev/global-cache -v /home/paul/.eldev/config:/org-gtd.el/.eldev/docker-home/.eldev/config silex/emacs:27.2 sh -c 'ELDEV_LOCAL=/eldev /eldev/bin/eldev eval 1'
[1/7] Installing package `org' (9.6.4) from `gnu'...
[2/7] Installing package `org-edna' (1.1.2) from `gnu'...
[3/7] Installing package `s' (1.13.0) from `melpa-stable'...
[4/7] Installing package `dash' (2.19.1) from `gnu'...
[5/7] Installing package `f' (0.20.0) from `melpa-stable'...
[6/7] Installing package `org-agenda-property' (1.3.1) from `melpa-stable'...
[7/7] Installing package `transient' (0.3.7) from `gnu'...
[1/1] Installing package `buttercup' (1.31) from `melpa-stable'...
Warning (org-gtd): 

|--------------------------|
| WARNING: action required |
|--------------------------|

Upgrading to 2.1.0 requires changing the org-edna triggers for the project
categories. Failure to do so means your projects will end up in inconsistent
states.

See the documentation for instructions to upgrade (C-h i, then find org-gtd),
then add the following setting to your config file (BEFORE ORG-GTD LOADS)
to disable this warning.

(setq org-gtd-update-ack "2.1.0")

1
Finished successfully on Tue Apr 25 19:24:12 2023

Can you also try cleaning up .eldev in the project?

doublep commented 1 year ago

Also, I get a similar result if I use Emacs 27 or 26 instead of Emacs 29 for the "outer" Eldev. I.e. this doesn't appear to depend on the Emacs version installed on your normal OS.

doublep commented 1 year ago

One more thing: you may want to add -dt to the command line also between docker 27.2 and eval 1. These options will be used as global options for the Eldev process running inside the container; the first -dt only applies to the "outer" process that actually executes on your normal OS.

juergenhoetzel commented 1 year ago

> I'm not really sure what is happening here, are you able to provide help?

I use podman rather than docker, and it reports the same issue:

Opening output file: Permission denied, /eldev/.eldev/ever-initialized

Workaround: map the host UID to the same UID within the container:

;; Eldev-local
(setq eldev-docker-executable "podman"
      eldev-docker-run-extra-args '("--userns=keep-id"))

Root cause: in rootless containers a user namespace is always used, and root in the container will by default correspond to the UID and GID of the user invoking Podman.

I wonder why the explicit -u UID:GID was added in the first place, because it causes mapping to a non-privileged UID (according to /etc/subuid).

doublep commented 1 year ago

I don't know, maybe @LaurenceWarne, the author of Docker-related code, can comment?

I also see now that my question about the UID was pointless, as 1000 is not hardcoded but is calculated at runtime with the functions user-uid and group-gid.

Trevoke commented 1 year ago

So, the workaround indicated by @juergenhoetzel worked. In addition to this, I had to edit /etc/containers/registries.conf and add the line `unqualified-search-registries = ["docker.io"]` to it.

I found the workaround described here: https://unix.stackexchange.com/a/701785/2015. Apparently, if we want to pull from docker.io, we now need to fully qualify the image name, i.e. docker.io/silex/emacs:27.2, I believe.

Thanks! Now I can test my package locally on multiple versions of Emacs, which is pretty important. I'll keep this open for now; I think @doublep can decide better than I can when it can be closed.

LaurenceWarne commented 1 year ago

Like @doublep, I can't reproduce this with docker (20.10.5), but I can with podman, as @juergenhoetzel describes. IIRC, permission errors with the mount points were a big problem in the initial integration: https://github.com/doublep/eldev/pull/53#issuecomment-955730560, so it looks like the hack outlined there only works with docker, not with podman or, I'm guessing, with Docker Desktop for @Trevoke (I've not tried Docker Desktop).

I'm glad @juergenhoetzel's workaround worked, though I admit I'm a bit confused why it works :sweat_smile:. I'm not really familiar with podman, reading the rootless documentation:

> If your container runs with the root user, then root in the container is actually your user on the host. UID/GID 1 is the first UID/GID specified in your user's mapping in /etc/subuid and /etc/subgid, etc. If you mount a directory from the host into a container as a rootless user, and create a file in that directory as root in the container, you'll see it's actually owned by your user on the host.

So what @juergenhoetzel says makes sense, since Silex's containers run as root. However, doesn't this mean no special configuration is needed? Running:

podman run --rm -e 'HOME=/org-gtd.el/.eldev/docker-home' -v /home/laurencewarne/projects/org-gtd.el/:/org-gtd.el -w /org-gtd.el -v /home/laurencewarne/.cache/eldev/28.1/bootstrap/eldev-1.3.1/bin/eldev:/org-gtd.el/.eldev/docker-home/bin/eldev -v /home/laurencewarne/.cache/eldev/global-cache:/org-gtd.el/.eldev/docker-home/.eldev/global-cache silex/emacs:27.2 sh -c 'export PATH="$HOME/bin:$PATH" && eldev '\''--color=always'\'' eval 1'

(this is the command that eldev -dt docker 27.2 eval 1 uses to start the container, but without the -u flag and also without the --userns flag) confirms this, I find, and gives no errors (e.g. the dirs created in the container correspond to my uid/gid on the host). I'd be curious to know if this also works for @juergenhoetzel and @Trevoke.

While testing, I've also noticed that UI support has been dropped for the images currently being used, so I don't think eldev docker xyz emacs is going to work anymore :disappointed:.

doublep commented 1 year ago

So, while fixing issue #86, I ran into a similar-looking permissions problem during standard GitHub CI. I managed to solve it in commit 506b02e. Can you check if this fixes the bug that you encountered too?

> While testing, I've also noticed that UI support has been dropped for the images currently being used, so I don't think eldev docker xyz emacs is going to work anymore :disappointed:.

Unfortunately, this is not something I can fix. Maybe you should discuss that with Silex?

doublep commented 1 year ago

Just as a note: the cause appears to have been ~/.cache/eldev/global-cache or ~/.eldev/global-cache (depending on whether your installation predates XDG support or not) being created with root as the owner, not your normal user. If that's already the case, the simplest fix would be to just delete it. After commit 506b02e, Eldev's docker command shouldn't create the directory with the wrong owner anymore, but you need to manually delete or chown it if that has happened already.

Trevoke commented 1 year ago

So:

  1. I am on eldev 1.4
  2. I have ~/.eldev/global-cache, but it has been created with my regular user.
  3. My current Eldev-local on my project has (setq eldev-docker-executable "podman" eldev-docker-run-extra-args '("--userns=keep-id"))
  4. I have /etc/containers/registries.conf with unqualified-search-registries = ["docker.io"]

With this setup, I can do eldev -dt docker 27.2 -C -dtp test.


If I comment out the two lines in Eldev-local then the same permission errors that we had to deal with above appear.

Cleaning out ~/.eldev/global-cache does not change anything. Cleaning out the project's .eldev directory does not change anything either.

So... I'm sticking to the podman path for now, I suppose?

doublep commented 1 year ago

Crap... Can you check whether there are root-created files in .eldev, or other files that do not belong to your user? Can you post the exact output of the failing eldev -dt docker 27.2 -dtp test (it's probably the same as in the original post, but I'd like to be sure)? Do you have any ideas why it works without problems on some machines, but not the others, and why using podman makes a difference?

LaurenceWarne commented 1 year ago

> Do you have any ideas why it works without problems on some machines, but not the others, and why using podman makes a difference?

I think as @juergenhoetzel says the explicit -u we are using screws with rootless containers. @Trevoke: eldev -dt docker 27.2 eval 1 should print the full docker command. If you remove the -u xyz:xyz flag from the command and run it, does that solve the permissions issue?

If so, the use of -u could possibly be special-cased by Eldev (e.g. if eldev-docker-executable is podman, don't add the -u flag)?
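
A minimal sketch of that special-casing, written as a standalone shell function rather than as Eldev's actual Emacs Lisp code. The function name user_flag_for is hypothetical, and treating every executable named podman as rootless is only an approximation (Podman can also be run as root).

```shell
#!/bin/sh
# Hypothetical helper (not part of Eldev): print the user flag to pass to
# the container manager, or nothing when the executable is Podman.
user_flag_for() {
    exe=$(basename "$1")
    case "$exe" in
        podman)
            # Assumed rootless: container root already maps to the host user.
            ;;
        *)
            # Rootful Docker: run as the host uid:gid so files keep their owner.
            printf '%s %s\n' "-u" "$(id -u):$(id -g)"
            ;;
    esac
}

user_flag_for /usr/bin/podman   # prints nothing
user_flag_for /usr/bin/docker   # prints the -u flag with the caller's uid:gid
```

Keying on the executable name is the cheapest possible check; it would misjudge rootless Docker and rootful Podman, so it is at best a stopgap.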

doublep commented 1 year ago

> I think as @juergenhoetzel says the explicit -u we are using screws with rootless containers

What is a rootless container (sorry, I know, I suck at dockering)? Why is it that we use the same Silex-provided images, but on my machine everything works fine, yet on Trevoke's it doesn't?

As I understand it, -u is needed because Emacs inside Docker may want to write something, and without -u we can run into creating files with the wrong owner. Or even when reading we could get permission-denied errors if the user inside Docker has a different uid compared to your normal user. Yet with Trevoke's setup we have it vice-versa, with the supposed fix of -u screwing things up...

LaurenceWarne commented 1 year ago

> What is a rootless container (sorry, I know, I suck at dockering)? Why is it that we use the same Silex-provided images, but on my machine everything works fine, yet on Trevoke's it doesn't?

I'm very much a podman noob myself :sweat_smile:, I believe it's just a way to run containers without root privileges (e.g. no need to run as root or add yourself to the docker group). If you're interested, I found https://developers.redhat.com/blog/2020/09/25/rootless-containers-with-podman-the-basics interesting/useful (podman should also be available in most package managers, I believe, if you wanted to try it out).

> As I understand it, -u is needed because Emacs inside Docker may want to write something, and without -u we can run into creating files with the wrong owner. Or even when reading we could get permission-denied errors if the user inside Docker has a different uid compared to your normal user. Yet with Trevoke's setup we have it vice-versa, with the supposed fix of -u screwing things up...

I think this is the case when using normal docker (we need -u), but with rootless/podman I think this is not necessary and/or leads to issues: https://github.com/containers/podman/blob/main/docs/tutorials/rootless_tutorial.md#using-volumes.

doublep commented 1 year ago

Ok, I think we are mixing up two issues here. Let's exclude Podman from consideration for now — maybe we simply need to add another command for it. Later.

The original issue was that it fails with Docker. @Trevoke: brief googling suggests that Docker also has a rootless mode. Do you run it in this mode? Here it runs as a root-level daemon, suggesting that it is not rootless.

Trevoke commented 1 year ago

I copied the docker command and removed the -u, which is to say:

/usr/local/bin/docker run --rm -e 'HOME=/org-gtd.el/.eldev/docker-home' -v /home/stag/src/projects/org-gtd.el/:/org-gtd.el -w /org-gtd.el -v /home/stag/.eldev/27.1/bootstrap/eldev-1.4/bin/eldev:/org-gtd.el/.eldev/docker-home/bin/eldev -v /home/stag/.eldev:/org-gtd.el/.eldev/docker-home/.eldev silex/emacs:28.2 sh -c 'export PATH="$HOME/bin:$PATH" && eldev '\''--color=always'\'' -C -dt test'

I commented out the settings in Eldev-local, so it ran neither with podman nor with --userns=keep-id.

And... it worked.

FURTHER, now when I ask it to run with eldev -dt docker 27.2 -C -dt test or the 28.2 equivalent, IT STILL WORKS.

It might be worth noting that the generated command uses /home/stag/.eldev/27.1 even though I asked for 28.2, as the silex/emacs:28.2 in the command shows.

This might indicate that the fix you had suggested was correct, but that I didn't clean up the directory I needed to, because... maybe there's another bug somewhere in which Eldev setup is being used?

doublep commented 1 year ago

OK, I really suspect that it is a problem of rootless vs. "normal" container manager. I guess that's exactly what @juergenhoetzel said, but I wasn't able to comprehend it then with my poor knowledge of this stuff. Anyway, here is what I have here now:

paul@gonzo:~/eldev$ docker info | grep -i root
 Docker Root Dir: /var/lib/docker
paul@gonzo:~/eldev$ podman info | grep -i root
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
  graphRoot: /home/paul/.local/share/containers/storage
  graphRootAllocated: 116822036480
  graphRootUsed: 98728112128
  runRoot: /run/user/1000/containers

I.e. Docker is "rootful" (according to what I find on the internet, /var/lib/docker means exactly that), Podman is rootless. I can now reproduce the problem using Podman:

paul@gonzo:~/eldev$ eldev -S "(setf eldev-docker-executable \"podman\")" docker 27 eval 1
Opening output file: Permission denied, /eldev/.eldev/ever-initialized
Run with `--debug' (`-d') option to see error backtrace

Docker process exited with error code 1

But with Docker it works as it always has (on this machine), and I assume this is because the current implementation is tailored to "rootful" container managers:

paul@gonzo:~/eldev$ eldev -S "(setf eldev-docker-executable \"docker\")" docker 27 eval 1
1

@Trevoke: I presume that your original post indicates that Docker on your machine is rootless (unlike here, for example). Can you confirm that?

Is there a good way to find out whether Docker/Podman is rootless on a given system? I guess we cannot have a single command that works for both cases, but maybe we can have two, and if we can find a way to choose which to use, we can resolve this issue.
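
One possible approach, as a rough sketch: both tools describe themselves in their info output. Podman prints a line rootless: true (as in the output above), and Docker in rootless mode is documented to list rootless among its security options. The helper name info_reports_rootless and the exact patterns matched are assumptions, not something Eldev does.

```shell
#!/bin/sh
# Hypothetical helper: read the output of `docker info` or `podman info`
# on stdin and succeed (exit status 0) if it reports rootless mode.
info_reports_rootless() {
    # Patterns (assumed, based on typical output):
    #   "rootless: true"  - Podman's info output
    #   "name=rootless"   - Docker's SecurityOptions in --format output
    #   a line that is only "rootless" - Docker's plain-text Security Options
    grep -Eq 'rootless: true|name=rootless|^[[:space:]]*rootless[[:space:]]*$'
}

# Usage against a live installation (requires the tool to be installed):
#   docker info 2>/dev/null | info_reports_rootless && echo "docker is rootless"
#   podman info 2>/dev/null | info_reports_rootless && echo "podman is rootless"
```

Note that a line such as Docker Root Dir: /var/lib/docker deliberately does not match, so a plain grep -i root is not enough to tell the two modes apart.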

Trevoke commented 1 year ago

I am using ... I guess... Rooted docker?

stag@zeus projects/org-gtd.el % docker info | grep -i root
 Docker Root Dir: /var/lib/docker

stag@zeus projects/org-gtd.el % podman info | grep -i root
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
  graphRoot: /home/stag/.local/share/containers/storage
  runRoot: /run/user/1000/containers

doublep commented 1 year ago

Crap, I don't understand anything again. For a while, I had been thinking the problem was caused by rootless Docker...

> I have ~/.eldev/global-cache, but it has been created with my regular user.

Please make sure there are no root-created files in ~/.eldev at all, e.g. ~/.eldev/27.1/bootstrap. E.g.:

$ find ~/.eldev | xargs ls -l | grep root
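
A slightly more robust variant of that check, as a sketch: the pipeline above breaks on file names containing spaces and matches any path that merely contains the string root, whereas find can test ownership directly. The function name list_foreign_files is hypothetical; it lists everything not owned by the current user.

```shell
#!/bin/sh
# Hypothetical helper: list every file under directory $1 that is NOT
# owned by the user running the script (root-owned files included).
list_foreign_files() {
    find "$1" ! -user "$(id -u)" -exec ls -ld {} +
}

# Check both locations mentioned in this thread, skipping any that do not exist.
for dir in "$HOME/.eldev" "$HOME/.cache/eldev"; do
    if [ -d "$dir" ]; then
        list_foreign_files "$dir"
    fi
done
```

Empty output means every file belongs to the current user; anything listed is a candidate for chown or deletion.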

doublep commented 1 year ago

> Please make sure there are no root-created files

With the recently released 1.4.1 you can do this simply by executing eldev doctor (ignore other warnings, if any), or eldev doctor eldev-file-owners.

doublep commented 7 months ago

I committed some changes to Eldev that should hopefully improve the situation. As I now have a different (version of) OS, I was able to experiment more. As far as I understand it, the original problem was caused by installing the podman-docker package, which seems to make Podman substitute for Docker (they are mostly compatible). However, as Podman, at least on Debian, defaults to rootless mode, Eldev would die with this problem.

It seems that in rootless mode we can just avoid passing the -u option to Docker/Podman altogether instead of using --userns as a fix on top of it. If anyone is interested in trying this out, the documentation describes a way to test unreleased Eldev.

Also, if you are familiar with this stuff, please check the commit (it is small) and comment if something could be done better.

doublep commented 7 months ago

Should hopefully be fixed in Eldev 1.10. Feel free to reopen if the fix is not enough.