mviereck / x11docker

Run GUI applications and desktops in docker and podman containers. Focus on security.
MIT License

--interactive fails on Ubuntu 18.04 #134

Closed: gidfiddle closed this issue 5 years ago

gidfiddle commented 5 years ago

The bash script x11docker sometimes invokes sh or /bin/sh rather than bash or /bin/bash. On a host system such as Fedora 29, /bin/sh is the same as /bin/bash, but on an Ubuntu 18.04 host, /bin/sh is /bin/dash, which is different from /bin/bash.
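One quick way to see which shell /bin/sh points to on a given host is to resolve the symlink. This is a minimal sketch, assuming a `readlink` that supports `-f` (as GNU coreutils does); the function name `sh_target` is made up for illustration.

```shell
#!/bin/sh
# Print the real file behind a path, following all symlinks.
# Defaults to /bin/sh when no argument is given.
sh_target() {
    readlink -f "${1:-/bin/sh}"
}

# Report what /bin/sh resolves to on this host.
echo "/bin/sh resolves to: $(sh_target /bin/sh)"
```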

This fact causes x11docker 5.4.5-beta to behave differently for Fedora 29 and Ubuntu 18.04 hosts (where Fedora has docker 18.06.0-dev and Ubuntu has docker 18.06.1-ce). For example, on a Fedora 29 host, the behavior of

    x11docker -it --nxagent ubuntu:bionic /bin/bash

is similar to that of

    docker run -it ubuntu:bionic /bin/bash

(both running an interactive bash session in the Ubuntu container). In contrast, on an Ubuntu 18.04 host, the x11docker command hangs. (It is not responsive to input, and it gives no output. If it is put into the background, it can be accessed using a docker exec command, say docker exec -it <container> ls /, but for the command docker exec -it <container> /bin/bash, the shell running in the container gives a prompt and then immediately exits.)

mviereck commented 5 years ago

Thank you for the report!

I can confirm that interactive mode fails in an Ubuntu 18.04 VM with docker version 18.06.1-ce.

However, the issue won't be a bash/dash/sh difference. x11docker generates some sh scripts, but only for use in the container. Scripts executed on the host use bash.

I'll investigate.

gidfiddle commented 5 years ago

I thought I had a demonstration of different behavior when I replaced dash by bash, but I must have made a mistake. Also, now I cannot reproduce my last statement: "but for the command docker exec -it <container> /bin/bash, the shell running in the container gives a prompt and then immediately exits".

You're right that the central problem is that --interactive fails on Ubuntu.

mviereck commented 5 years ago

It seems this bug was introduced with the fix for #133. Please replace https://github.com/mviereck/x11docker/blob/master/x11docker#L4284

    echo '  [ "$Pid1pid" ] && break'

with:

    echo '  [ "$Pid1pid" ] && [ "$Pid1pid" != "0" ] && break'

docker sometimes reports 0 as the host PID of container PID 1, and sometimes an empty string. Maybe it depends on the docker version.
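The corrected condition can be sketched as a standalone helper. This only illustrates the check in the replacement line and is not x11docker's actual code; the function name `pid1_is_known` is hypothetical, and only the variable name `Pid1pid` comes from the line above. The value 1234 is an arbitrary example PID.

```shell
#!/bin/sh
# Succeed only if the reported host PID is usable:
# neither an empty string nor "0".
pid1_is_known() {
    Pid1pid="$1"
    [ "$Pid1pid" ] && [ "$Pid1pid" != "0" ]
}

# Walk through the three kinds of value: empty, zero, and a real PID.
for candidate in "" "0" "1234"; do
    if pid1_is_known "$candidate"; then
        echo "'$candidate': usable, stop waiting"
    else
        echo "'$candidate': not usable yet, keep waiting"
    fi
done
```

With the original one-part test, the value "0" would have counted as a valid PID and ended the wait loop too early.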

gidfiddle commented 5 years ago

Well, certainly you found and fixed a bug, but for me the problem with --interactive remains.

I added your fix to two versions of x11docker (git commits 00ee566 and 92f1283). For both, the command

    x11docker -it --nxagent ubuntu:bionic /bin/bash

hangs as I described before, despite the log report that "Host PID of container PID 1: 617".

mviereck commented 5 years ago

I cannot reproduce the bug in Ubuntu 18.04 VM now.

Can you provide me x11docker.log at pastebin.com, please?

gidfiddle commented 5 years ago

On an Ubuntu 18.04 machine, I ran

    x11docker -it --nxagent ubuntu:bionic /bin/bash

where x11docker is from git commit 00ee566 with your fix added. It hung. I stopped it using the command

    docker stop x11docker_X100_cc4106_ubuntu

The x11docker.log file is at

    https://pastebin.com/S0BTpwTi

mviereck commented 5 years ago

The log does not show an obvious issue. bash is running. Now I am curious. Can you try with sh instead of bash, please? Which terminal emulator do you use? Can you try with xterm, please? Can you try without --nxagent?

Side note, for clarification: -it --nxagent is short for --interactive --tty --nxagent. For x11docker, --tty just means 'no X server', and the later --nxagent supersedes it. The other way around, --nxagent -it would not run nxagent.
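The effect of option order can be illustrated with a toy parser in which a later option overwrites the choice made by an earlier one. This is a sketch of the general "last option wins" pattern, not x11docker's actual option parsing; the function `pick_xserver` and its output strings are invented for illustration.

```shell
#!/bin/sh
# Toy parser: every option that affects the X server choice
# overwrites whatever an earlier option selected.
pick_xserver() {
    xserver="default"
    for opt in "$@"; do
        case "$opt" in
            -t|--tty|-it|-ti) xserver="none" ;;     # --tty implies 'no X server'
            --nxagent)        xserver="nxagent" ;;  # explicit X server choice
            *) ;;                                   # ignore everything else
        esac
    done
    echo "$xserver"
}

pick_xserver -it --nxagent   # prints "nxagent"
pick_xserver --nxagent -it   # prints "none"
```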

gidfiddle commented 5 years ago

Can you try with sh instead of bash, please?

Well,

    x11docker -it --nxagent ubuntu:bionic /bin/sh

behaves the same as with /bin/bash. But let me explain a bit more about that behavior.

Using ^Z and bg I can put this command into the background. It continues running a container in the background, as seen from docker ps. Importantly, it does not stop itself, as it would if it were listening for input. And x11docker -it --nxagent ubuntu:bionic ls / produces no output, unlike docker run -it ubuntu:bionic ls /. Thus the /bin/sh does not seem to be connected to input or output. In fact, it doesn't play any role, because running x11docker -it --nxagent ubuntu:bionic & does the same job of providing a running Ubuntu OS that can be accessed using docker exec -it <container> /bin/sh.
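A simple way to test whether a process really has a terminal on its standard input is the POSIX `[ -t 0 ]` test. The sketch below is a generic diagnostic, not something taken from x11docker; the function name `stdin_kind` is hypothetical.

```shell
#!/bin/sh
# Report whether standard input (file descriptor 0) is attached to a terminal.
stdin_kind() {
    if [ -t 0 ]; then
        echo "stdin is a terminal"
    else
        echo "stdin is not a terminal"
    fi
}

# With input redirected from /dev/null there is no terminal on fd 0.
stdin_kind </dev/null   # prints "stdin is not a terminal"
```

Running such a check as the container command, or through docker exec, could help tell apart a shell that has no TTY from one whose output is simply not reaching the host terminal.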

Should we be using the --stdin option on x11docker or give an --attach option to the docker run that it invokes?

Which terminal emulator do you use? Can you try with xterm, please?

On the host I'm running Gnome, and I'm launching from a gnome-terminal. Launching from an xterm does not affect the behavior.

Can you try without --nxagent?

Well, x11docker -it ubuntu:bionic /bin/bash and x11docker -it ubuntu:bionic behave the same as with --nxagent. On the other hand, x11docker -i ubuntu:bionic /bin/sh runs the Xpra server, which fails to start up roughly 50% of the time. x11docker -it --xephyr ubuntu:bionic /bin/sh works the same as with --nxagent.

Side note: ...

Thanks for the clarification. I didn't realize that the option order would matter.

P.S. I very much appreciate your help on this problem, and I'm happy to keep running experiments at your direction as long as you want. I understand that it's onerous to debug a problem that you cannot reproduce on your own computer. Hmm, perhaps I'll test with Ubuntu running on different hardware.

gidfiddle commented 5 years ago

I put a fresh copy of Ubuntu 18.04 on a different (newer) computer, downloaded docker, and installed the 00ee566 commit of x11docker with your fix. On this platform, x11docker -it works as it should!

It seems that the problem I'm having with --interactive is peculiar to my older computer.

Sorry! I hope I didn't waste too much of your time.

I have yet to upgrade the Ubuntu software on the newer computer (its software installation is not yet very close to that on the older computer). Presumably x11docker will continue to work properly after the upgrade. Then I will reinstall Ubuntu on the older computer to see whether that fixes the problem.

Again, thanks for your help. Your software will play a critical role in allowing me and my international colleagues to deploy our mathematical fluid dynamics graphics program to other researchers. (We'll be sure to cite your contribution.)

mviereck commented 5 years ago

I put a fresh copy of Ubuntu 18.04 on a different (newer) computer, downloaded docker, and installed the 00ee566 commit of x11docker with your fix. On this platform, x11docker -it works as it should!

Good to know, thank you for the test! My Ubuntu 18.04 VM was quite old. I've updated it and still see no issue here.

The latest commit in master contains a fixed PID 1 check, so you won't need to add it yourself now.

It seems that the problem I'm having with --interactive is peculiar to my older computer.

It is still odd. TTY input/output is a basic procedure that should not behave differently on different computers or system versions. And if I understood you correctly, docker run -ti ubuntu:bionic bash does work on that machine? Maybe just wait a while; it might take some time until the interactive prompt appears.

Sorry! I hope I didn't waste too much of your time.

No worries! There was one real bug causing this issue. Why it still fails on one particular computer remains a riddle.

Your software will play a critical role in allowing me and my international colleagues to deploy our mathematical fluid dynamics graphics program to other researchers. (We'll be sure to cite your contribution.)

That is an honour, thanks! I should finally submit a paper to JOSS to make x11docker citable. This is discussed in #92. Would you mind having a look at paper.md and giving feedback on whether it is ok or needs a change?

Edit:

On the other hand, x11docker -i ubuntu:bionic /bin/sh runs the Xpra server, which fails to start up roughly 50% of the time.

It might have been a false error message due to a race condition, if you terminated x11docker before the xpra server was ready. The latest commit should avoid the error message in this case.