Closed phil294 closed 4 years ago
Thank you for the bug report!
I found this error message:
/tmp/containerrootrc: 11: /tmp/containerrootrc: cannot create /x11docker/container.log: Permission denied
It is this line in containerrootrc
:
exec 1>>/x11docker/container.log 2>&1
It only serves to redirect some log output. I'll think about a possible alternative. The error message is confusing because the file exists for sure.
I got around to check the commits already:
bcb791cc9d581e1cb404761c52127ca63d26d110 is when the issue arose. The log says x11docker[14:18:12,517]: Waiting since 2s for /x11docker/containerrootrc.ready to have content, will wait up to 32000 seconds.
. Do you need any more verbose logs from around this commit?
Part of the problem (seeing that this issue hasnt been opened yet) may be that I set the permissions on ~/.local
and most other folders in $HOME to drwx------
700. I think it makes the most sense as the contents of this folder arent any other user's business, be it system or human. This is what had had me open another issue once already (#131), where you fixed it by leveraging a /tmp file.
I also found that running with --cap-default
fixes it. If you decide this is a wont-fix, I could personally live with using this flag, but I assume a proper error message would be great, at least.
In the end, I am not sure what exactly is causing this but I hope this will help you. Cheers
It also did not work after a chmod -R o+rX ~/.cache
, however one warning came up during this: chmod: changing permissions of '.cache/x11docker/x11docker-xfce-xfce4-terminal-55668200901/share/tini': Operation not permitted
.
This is what the respective share
folder looks like:
-rw-r--r-- 1 phi phi 3054 Nov 20 14:14 container.CMD.sh
-rw-r--r-- 1 phi phi 4412 Nov 20 14:14 containerrootrc
-rw-r--r-- 1 phi phi 4 Nov 20 14:14 container.user
-rw-rw-rw- 1 phi phi 293 Nov 20 14:14 environment
-rw-r--r-- 1 phi phi 2 Nov 20 14:14 exitcode
-rw-r--r-- 1 phi phi 0 Nov 20 14:14 journalctl.log
prw-rw-rw- 1 phi phi 0 Nov 20 14:14 message.fifo
-rw-rw-rw- 1 phi phi 704 Nov 20 14:14 stderr
-rw-rw-rw- 1 phi phi 0 Nov 20 14:14 stdout
-rw-r--r-- 1 phi phi 51 Nov 20 14:14 timetosaygoodbye
prw-rw-r-- 1 phi phi 0 Nov 20 14:14 timetosaygoodbye.fifo
-rwxr-xr-x 1 root root 0 Nov 20 14:14 tini
-rw-r--r-- 1 phi phi 69315 Nov 20 14:14 x11docker.log
-rw-r--r-- 1 phi phi 104 Nov 20 14:14 Xclientcookie
which looks fine I think (?)
Thanks for your investigation!
Part of the problem (seeing that this issue hasnt been opened yet) may be that I set the permissions on ~/.local and most other folders in $HOME to drwx------ 700. I think it makes the most sense as the contents of this folder arent any other user's business, be it system or human. This is what had had me open another issue once already (#131), where you fixed it by leveraging a /tmp file.
Ah, yes! I already thought that this bug looks somehow familar. So I reintroduced it.
As a first fix that maybe helps I've changed the file permission of x11docker.log
to 666
. I doubt that this is enough, but it would be nice if you try it out.
I also found that running with --cap-default fixes it.
The point is that x11docker runs the container with docker option --cap-drop=ALL
. That disallows a lot of root privileges. But containerrootrc
tries to access x11docker.log
as root in container and cannot supersede your 700
settings on host.
Adding the capability to supersede access permissions should solve the issue (compare man capabilities
):
x11docker -- --cap-add DAC_OVERRIDE -- x11docker/xfce xfce4-terminal
I am currently not sure what would be the best solution.
DAC_OVERRIDE
regardless of the setup. Easy, but a security impact on systems that would not need it.700
and only adding DAC_OVERRIDE
in that case. Adds some complexity in the code.chmod -R o+rX ~/.cache
Sorry for asking, but what exactly does that do?
I tried using the latest commit, still no change.
Nope, DAC_OVERRIDE on its own does not cut it, only cap-default.
I am not very sure on to the details of this. But option 1 and 2 do not sound quite elegant. And it would make sense to prevent root files in the host environment. Those are always a bummer when it comes to archives, searches, deletions etc. Like when you want to backup your home folder but get errors somewhere deep nested inside .local/share/x11docker, as it contains root files. Option 3 makes the most sense to me, but this is only intuition.
chmod -R o+rX ~/.cache
Sorry for asking, but what exactly does that do?
No worries – it means recursively giving read access (+4) to "other" users (not owner or group) to everthing inside .cache. X
stands for folder execution rights (+1), so they will be able to list folder contents, but it does not apply to files (so no scripts can be run etc). +
means "adding permissions", so if any were present beforehand, those are combined. To set them, use =
.
Ok wait, the recent commit might have solved this (but everything is kinda laggy now). Just so you dont spend too much unnecessary digging here. I'll get back to you later.
Ok wait, the recent commit might have solved this (but everything is kinda laggy now).
Good! Maybe there is another timeout now. Please give me a fresh logfile and I'll look through it.
And it would make sense to prevent root files in the host environment. Those are always a bummer when it comes to archives, searches, deletions etc.
x11docker already avoids root files in HOME
. root only appends lines to existing files. That does not change the file ownership. But this seems to be an issue with 700
for ~/.cache
I am surprised why you have a root-owned tini
in the share
folder:
-rwxr-xr-x 1 root root 0 Nov 20 14:14 tini
It should not be there. It is shared with the container with:
--volume '/usr/bin/docker-init':'/usr/local/bin/tini':ro \
In my own x11docker cache tini does not appear.
With latest commit:
With previous commit, before your latest fix (at 25b8034f24238eed267784def8f2e33de5a489d1), everything seems the same as above. So nothing changed.
Here's a new verbose log file as requested, from latest commit's ./x11docker -v --xephyr x11docker/xfce xfce4-terminal
, run from terminal: https://waritschlager.de/share/f43f639000f6f3c6.txt
Nothing seems to have changed much
* but is slow, like really laggy, vscode editor is pretty much unusable. But somehow now also older versions of x11docker are as slow, when they were smooth a few days ago(?!) I will investigate this further, you may consider those cases working perfectly for now
I am surprised why you have a root-owned
tini
in theshare
folder
I thought I'd try to reproduce it by deleting the respective cache folder. But now when I run it (with cap-default), no folder is recreated inside .cache/x11docker anymore, so I cannot give you any more details.
Okay, regarding the lagginess: This might not be related to this issue at all, but I'll post the info here anyway:
The lags got introduced with 5a35b8107ca043d2f0dd8a2fafe97157164bbc5f. Note that this is an earlier one than the one that broke everything described above. I only encounter those lags with Xephyr. The application I tested this with is xfce, with vscode as the application running inside. I dont know what is so special about it, but thunar for instance was not lagging. But maybe it's just not as resource intensive. Please tell me if you want more info here.
While digging, I also realized that nxagent and hostdisplay seem to behave more fluent than xephyr and xpra. I dont know how nxagent works, but with hostdisplay it makes sense as no key presses are proxied.
Thank you for your detailed investigation!
The lags got introduced with 5a35b81. Note that this is an earlier one than the one that broke everything described above. I only encounter those lags with Xephyr. The application I tested this with is xfce, with vscode as the application running inside. I dont know what is so special about it, but thunar for instance was not lagging.
Finally an issue that I can fix easily! In the commit you found I've enabled Xephyr option -glamor
. From Xephyr -help
:
-glamor Enable 2D acceleration using glamor
glamor should help to speed up some things, but appearently it can be problematic. I've disabled it yet. --xephyr
should not be laggy anymore.
While digging, I also realized that nxagent and hostdisplay seem to behave more fluent than xephyr and xpra
--hostdisplay
is the fastest option because no additional X server is involved. Unfortunately it costs some container isolation. A malicious application could access your host system.
--xpra
is the slowest option. But it is a preferred default of x11docker because it provides some nice features, e.g. graphical clipboard support.
Furthermore it is the only seamless solution for --gpu
beside the insecure --hostdisplay
.
However, if security/container isolation is not a concern, --hostdisplay --gpu
is the fastest setup with the lowest overhead.
I thought I'd try to reproduce it by deleting the respective cache folder. But now when I run it (with cap-default), no folder is recreated inside .cache/x11docker anymore, so I cannot give you any more details.
The cache folder only exists while the container is running. If you don't have a cache folder while x11docker is running, something very basical goes wrong. Your log shows valid pathes:
/home/phi/.cache/x11docker/x11docker-xfce-xfce4-terminal-44151347946
I just tried to reproduce the 700
issue with chmod -R 700 ~/.cache/x11docker
.
Surprisingly I have no issues at all and cannot reproduce your issue. x11docker just starts up well. I'll look closer how to reproduce it.
Started from a noninteractive shell, with no special options (no cap-default), using a xfce hotkey shortcut on the host system: Works fine*. I did not expect that. It is the opposite behavior of a past issue #177 where it was the shortcut action that failed. - I checked the user from such a shortcut job, it is also the normal one, "phi". Maybe a tty issue again?
That's odd. I doubt that it is a tty issue again. I would see that in the log.
Maybe you have two x11docker on your system, e.g. one in a cloned git folder and one in /usr/bin
?
Here's a new verbose log file as requested, from latest commit's ./x11docker -v --xephyr x11docker/xfce xfce4-terminal, run from terminal: https://waritschlager.de/share/f43f639000f6f3c6.txt Nothing seems to have changed much
The log shows version 6.4.0
, but the master version with recent commits should show:
$ x11docker --version
6.4.1-beta
Huh, sorry - I dont know how the version mismatch happened. I redid it with the latest commit from yesterday, for sure this time: https://waritschlager.de/share/32492a8420106529.txt
That's odd. I doubt that it is a tty issue again. I would see that in the log. Maybe you have two x11docker on your system, e.g. one in a cloned git folder and one in /usr/bin?
No, this is not the case. xfce shortcut and interactive bash definitely behave differently (one working, the other not). I removed all the times and pids from it with some wild regex and skipped display numbers etc. and below are the notable log output differences when run as xfce shortcut. As expected, the only real difference seems to be that the container.log
permission error is gone.
8a9
> DEBUGNOTE: check_host(): Command tty failed. Guess if running on console: no
45a47
> DEBUGNOTE: check_host(): Command tty failed. Guess if running on console: no
124c126
< Running in a terminal: yes
---
> Running in a terminal: no
371,377d372
< grep -x -q 'x11docker/xfce' < /home/phi/.cache/x11docker/docker.imagelist || grep -x -q 'x11docker/xfce:latest' < /home/phi/.cache/x11docker/docker.imagelist || {
< docker inspect x11docker/xfce >>/home/phi/.cache/x11docker/x11docker-xfce-/share/container.log 2>&1 || {
< echo 'Image x11docker/xfce not found locally.' >&2
< echo 'Do you want to pull it from docker hub?' >&2
< askyesno && Dockerpull=yes || error "Image 'x11docker/xfce' not available locally and not pulled from docker hub."
< }
< }
382a378
> env DISPLAY=':0.0' DBUS_SESSION_BUS_ADDRESS='unix:path=/run/user/1000/bus' bash -c "notify-send 'x11docker: Pulling image x11docker/xfce from docker hub'" 2>/dev/null &
1475c1471,1495
< /tmp/containerrootrc: 11: /tmp/containerrootrc: cannot create /x11docker/container.log: Permission denied
---
> mkdir: created directory '/var/run/dbus'
> mkdir: created directory '/tmp/.ICE-unix'
> mkdir: created directory '/tmp/.X11-unix'
> mkdir: created directory '/tmp/.font-unix'
> srwxrwxrwx 1 1000 1001 0 Nov 23 08:10 /X113
>
> ==> /home/phi/.cache/x11docker/x11docker-xfce-/message.log <==
> DEBUGNOTE: Running containerrootrc: Setup as root in container
>
> ==> /home/phi/.cache/x11docker/x11docker-xfce-/share/container.log <==
> lrwxrwxrwx 1 root root 5 Nov 23 08:10 /tmp -> /X113
> mkdir: created directory '/fakehome'
>
> ==> /home/phi/.cache/x11docker/x11docker-xfce-/message.log <==
> DEBUGNOTE: containerrootrc: Container libc: glibc
>
> ==> /home/phi/.cache/x11docker/x11docker-xfce-/share/container.log <==
> removed '/etc/shadow'
>
> ==> /home/phi/.cache/x11docker/x11docker-xfce-/message.log <==
> x11docker: Container system ID: debian
>
>
> ==> /home/phi/.cache/x11docker/x11docker-xfce-/share/container.log <==
> chown: changing ownership of '/tmp/chowntestfile': Operation not permitted
1551d1572
< DEBUGNOTE: waitforlogentry(): tailstderr: Found log entry "x11docker=ready" in store.info.
1553,1592c1574,1586
< DEBUGNOTE: waitforlogentry(): containerrc: Waiting since 11s for log entry "containerrootrc=ready" in store.info
< DEBUGNOTE: waitforlogentry(): containerrc: Waiting since 12s for log entry "containerrootrc=ready" in store.info
...
---
> DEBUGNOTE: waitforlogentry(): tailstderr: Found log entry "x11docker=ready" in store.info.
> DEBUGNOTE: waitforlogentry(): containerrc: Found log entry "containerrootrc=ready" in store.info.
So it fails here: https://github.com/mviereck/x11docker/blob/master/x11docker#L4282. $(convertpath share $Containerlogfile)
, which resolves to /x11docker/container.log
isnt accessible, because /x11docker
itself cannot be traversed into. I put an ls -l /
at that position and here is the output:
total 16
srwxrwxrwx 1 1000 1001 0 Nov 23 08:57 X120
drwxr-xr-x 2 root root 4096 Jul 14 08:49 bin
drwxr-xr-x 2 root root 6 May 13 2019 boot
drwxr-xr-x 5 root root 360 Nov 23 08:57 dev
drwxr-xr-x 40 root root 4096 Nov 23 08:57 etc
drwxr-xr-x 2 root root 6 May 13 2019 home
drwxr-xr-x 8 root root 107 Jul 14 08:49 lib
drwxr-xr-x 2 root root 34 Jul 8 03:30 lib64
drwxr-xr-x 2 root root 6 Jul 8 03:30 media
drwxr-xr-x 2 root root 6 Jul 8 03:30 mnt
drwxr-xr-x 2 root root 6 Jul 8 03:30 opt
dr-xr-xr-x 374 root root 0 Nov 23 08:57 proc
drwx------ 2 root root 37 Jul 8 03:30 root
drwxr-xr-x 3 root root 60 Nov 23 08:57 run
drwxr-xr-x 2 root root 4096 Jul 8 03:30 sbin
drwxr-xr-x 2 root root 6 Jul 8 03:30 srv
dr-xr-xr-x 13 root root 0 Nov 23 08:57 sys
drwxrwxrwt 2 root root 29 Nov 23 08:57 tmp
drwxr-xr-x 10 root root 105 Jul 8 03:30 usr
drwxr-xr-x 11 root root 139 Jul 8 03:30 var
drwxrwx--- 2 1000 1001 4096 Nov 23 08:57 x11docker
/tmp
ls: cannot access '/x11docker/container.log': Permission denied
and $PWD
is /tmp
and $USER
is phi
and $UID
is empty and id
says uid=0(root) gid=0(root) groups=0(root)
and cat /etc/passwd
also does not contain phi
. So the user $USER
doesnt exist..?!
On my host system, the the $UID
is (as on most systems) 1000.
Thanks for removing the glamor option! Everything is smooth again.
The cache folder only exists while the container is running.
Oh, huh. The cache folders I described above were present without any container running. So I guess they were leftovers from failed run attempts. Should not matter, this doesnt happen anymore right now.
No, this is not the case. xfce shortcut and interactive bash definitely behave differently (one working, the other not).
That's really odd. I have no good idea why there is a difference. That also indicates that 700
is not the core issue. As been said, I cannot reproduce the issue if I set my own cache to 700
.
The only idea I have yet is some sort of >>
redirection issue. But I would not know why it only happens in terminal, but not with a shortcut.
echo "exec 1>>$(convertpath share $Containerlogfile) 2>&1"
Could you try to just disable this line? It only serves to redirect some output into the logfile, so it would not hurt essentially. Though, ls
fails, too:
ls: cannot access '/x11docker/container.log': Permission denied
$PWD is /tmp and $USER is phi and $UID is empty and id says uid=0(root) gid=0(root) groups=0(root) and cat /etc/passwd also does not contain phi. So the user $USER doesnt exist..?!
The entries in /etc/passwd
and etc/group
are done shortly after that in containerrootrc
.
The cache folders I described above were present without any container running. So I guess they were leftovers from failed run attempts. Should not matter, this doesnt happen anymore right now.
I also get those leftover folders. It seems x11docker does not get enough time to clean up if I shut down the system while x11docker is running. You can use sudo x11docker --cleanup
to remove all leftovers (and currently running x11docker sessions).
I did a test in an old Manjarao VM.
I've set ~/.cache/x11docker
to 700. It works well.
I could not update Manjaro due to some package conflicts. But I assume that would not change anything.
So i cannot reproduce your issue and have no idea why x11docker fails on your system in terminal only.
Would it be ok for you to just use --cap-default
and close the ticket?
Though, if you have an idea and want to investigate further, I am happy to look at, too.
Sure. I'll get back to you when I come accross a meaningful hint. Thank you for the help!
The latest commit runs containerrootrc
with flag --privileged
. You should not need --cap-default
anymore.
x11docker's root setup in container now has no restrictions. This allow to drop --cap-default
and the container command will run without privileges at all.
It would be nice if you try this out.
yup, works now out of the box :-) good job
Great! :-)
Finally solved although we did not find the very special point where it previously failed.
Hi! Not really sure whats happening here. Happens with either of xephyr, xpra, nxagent and hostidsplay. As if the docker image doesnt even get started. Here is a verbose log output upload: https://waritschlager.de/share/922650f43f77a6fe.txt
Seems to have broken with a x11docker update of last few weeks/months, as I did not have any such problems beforehand. I can cycle through recent commits to find the culprit tomorrow.
until then, LG