mviereck / x11docker

Run GUI applications and desktops in docker and podman containers. Focus on security.
MIT License

x11docker how to restart my docker #433

Status: Closed (showfuture closed this issue 2 years ago)

showfuture commented 2 years ago

When I run "read Xenv < <(x11docker --gpu --runtime=nvidia --hostdisplay --showenv --home=/work_log --pull=yes --name=unity3d-job unity3d-job:20220407-28)", my container always goes down. How can I restart it automatically?

showfuture commented 2 years ago

I am looking for something like docker run -d --restart=always.

mviereck commented 2 years ago

There is no option to automatically restart x11docker. If your application crashes, it makes sense to check and fix the issue that leads to a crash.

You could run a loop like:

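while :; do
  # --showid prints the container ID; read stores it while x11docker keeps running
  read -r Containerid < <(x11docker --showid --gpu --runtime=nvidia --hostdisplay --home=/work_log --pull=yes --name=unity3d-job unity3d-job:20220407-28)
  # docker logs -f blocks until the container stops, then the loop starts over
  docker logs -f "$Containerid" >/dev/null 2>&1
done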

Or just:

while :; do
  x11docker --showid --gpu --runtime=nvidia --hostdisplay --home=/work_log --pull=yes --name=unity3d-job unity3d-job:20220407-28
done
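
A slightly more defensive variant of the second loop, as a minimal sketch: it pauses between restarts and stops after a fixed number of runs, so that a container crashing right at startup does not spin at full speed. The helper name restart_loop and its max_runs and delay arguments are assumptions made for this illustration; they are not part of x11docker.

```shell
#!/bin/bash
# restart_loop is a hypothetical helper, not an x11docker feature.
# Usage: restart_loop MAX_RUNS DELAY_SECONDS COMMAND [ARGS...]
restart_loop() {
  local max_runs=$1 delay=$2 run=0 status
  shift 2
  while [ "$run" -lt "$max_runs" ]; do
    run=$((run + 1))
    "$@"          # runs in the foreground until the command exits
    status=$?
    echo "run $run exited with status $status" >&2
    sleep "$delay"
  done
}

# Example call with the flags from this thread (left commented out here):
# restart_loop 100 5 x11docker --gpu --runtime=nvidia --hostdisplay --home=/work_log --pull=yes --name=unity3d-job unity3d-job:20220407-28
```

Logging the exit status of each run also gives a first hint about why the application stops, which is the question worth answering before relying on restarts.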

showfuture commented 2 years ago

==> /root/.cache/x11docker/registry-k8s-cnbeijing-yidianshihui-com--14723385505/message.log <==
DEBUGNOTE[15:00:01,587]: storeinfo(): containerrootrc=ready
DEBUGNOTE[15:00:01,642]: storeinfo(): dockerrc=ready
DEBUGNOTE[15:00:01,663]: storepid(): Stored pid '11081' of 'dockerstopshell': 11081 ?        00:00:00 bash
DEBUGNOTE[15:00:01,665]: waitforlogentry(): start_docker(): Waiting for logentry "dockerrc=ready" in store.info
DEBUGNOTE[15:00:01,672]: waitforlogentry(): start_docker(): Found log entry "dockerrc=ready" in store.info.
DEBUGNOTE[15:00:01,674]: storeinfo(): xtermrc=ready
DEBUGNOTE[15:00:01,678]: watchpidlist(): Setting pid 10637 on watchlist: pid1pid
DEBUGNOTE[15:00:01,698]: watchpidlist(): Watching pids:
10637 ?        00:00:00 init
DEBUGNOTE[15:00:01,700]: storepid(): Stored pid '10637' of 'pid1pid': 10637 ?        00:00:00 init
DEBUGNOTE[15:00:01,703]: waitforlogentry(): containerrc: Found log entry "containerrootrc=ready" in store.info.
DEBUGNOTE[15:00:01,716]: Running containerrc: Unprivileged user commands in container
DEBUGNOTE[15:00:01,723]: Process tree of container: (maybe not complete yet)
init(10637)---sh(10658)
x11docker[15:00:01,753]: containerrc: Container system:
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

DEBUGNOTE[15:00:01,770]: containerrc: HOME is not empty. Not copying from /etc/skel
DEBUGNOTE[15:00:01,774]: Process tree of x11docker:
bash(7635)-+-bash(8122)---tail(8125)
           |-bash(8124)---sleep(10860)
           |-bash(8126)---sleep(10854)
           |-bash(8145)
           |-bash(8159)
           |-bash(9215)---bash(11209)---pstree(11211)
           `-bash(10477)
  Lost child of dockerrc (dockerstopshell):
    bash(11081)
DEBUGNOTE[15:00:01,779]: storeinfo(): Stored info:
cache=/root/.cache/x11docker/registry-k8s-cnbeijing-yidianshihui-com--14723385505
stdout=/root/.cache/x11docker/registry-k8s-cnbeijing-yidianshihui-com--14723385505/share/stdout
stderr=/root/.cache/x11docker/registry-k8s-cnbeijing-yidianshihui-com--14723385505/share/stderr
x11dockerpid=7635
xserver=--hostdisplay
DISPLAY=:0
XSOCKET=/tmp/.X11-unix/X0
XDG_RUNTIME_DIR=/run/user/0
Xenv= DISPLAY=:0 XSOCKET=/tmp/.X11-unix/X0 XDG_RUNTIME_DIR=/run/user/0
tini=/usr/bin/docker-init
containername=unity3d-job
runtime=nvidia
containeruser=root
readyforX=ready
xinitrc=ready
containerid=fbb8620b3399afb9bec10f8762ddb9dc3a527a31f9bf9811a06a6f745ac5314a
pid1pid=10637
containerip=
containerrootrc=ready
dockerrc=ready
xtermrc=ready
DEBUGNOTE[15:00:01,785]: storepid(): Stored pids:
8145 watchpidlist
8159 watchmessagefifo
9215 containershell
11081 dockerstopshell
10637 pid1pid
DEBUGNOTE[15:00:01,794]: storeinfo(): x11docker=ready
x11docker[15:00:01,817]: Container environment:
BUILD_VERSION=20220407-29
DISPLAY=:0
HOME=/home/root
HOSTNAME=k8s-cnbeijing-extra-001
OLDPWD=/tmp
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/home/root
TERM=xterm
TZ=UTC-08
UNITY_3D_LOG_DIR=/home/root
USER=root
XDG_RUNTIME_DIR=/tmp/XDG_RUNTIME_DIR
XDG_SESSION_TYPE=x11
container=docker

DEBUGNOTE[15:00:01,831]: cmdrc: Running container command:
  '/bin/sh' '-c' '/unity-build -logfile  /home/root/unity3d-job-20220407-29.log'

DEBUGNOTE[15:00:05,019]: waitforlogentry(): tailstderr: Waiting since 78s for log entry "x11docker=ready" in store.info
DEBUGNOTE[15:00:05,021]: waitforlogentry(): tailstdout: Waiting since 78s for log entry "x11docker=ready" in store.info
DEBUGNOTE[15:00:05,030]: waitforlogentry(): tailstderr: Found log entry "x11docker=ready" in store.info.
DEBUGNOTE[15:00:05,029]: waitforlogentry(): tailstdout: Found log entry "x11docker=ready" in store.info.
DEBUGNOTE[10:38:34,774]: watchpidlist(): PID 10637 has terminated
DEBUGNOTE[10:38:34,777]: time to say goodbye (watchpidlist 10637)
DEBUGNOTE[10:38:34,781]: time to say goodbye (watchpidlist)
DEBUGNOTE[10:38:34,782]: time to say goodbye (main)
DEBUGNOTE[10:38:34,785]: Terminating x11docker.
DEBUGNOTE[10:38:34,788]: time to say goodbye (finish)
DEBUGNOTE[10:38:34,813]: finish(): Checking pid 10637 (pid1pid): (already gone)
DEBUGNOTE[10:38:34,844]: finish(): Checking pid 11081 (dockerstopshell): 11081 ?        00:00:00 bash
DEBUGNOTE[10:38:34,870]: finish(): Checking pid 9215 (containershell): (already gone)
DEBUGNOTE[10:38:34,898]: finish(): Checking pid 8159 (watchmessagefifo):  8159 ?        00:00:00 bash
DEBUGNOTE[10:38:34,926]: finish(): Checking pid 8145 (watchpidlist): (already gone)
DEBUGNOTE[10:38:35,010]: Removing container unity3d-job
    Error: No such container: unity3d-job
DEBUGNOTE[10:38:36,051]: termpid(): Terminating 8159 (watchmessagefifo):  8159 ?        00:00:00 bash

I found this log. It contains 'watchpidlist(): PID 10637 has terminated' and 'Removing container unity3d-job'. Why did this happen?

mviereck commented 2 years ago

> I found this log. It contains 'watchpidlist(): PID 10637 has terminated' and 'Removing container unity3d-job'. Why did this happen?

It seems that the container starts, but your container application exits very soon. In that case x11docker terminates, too. PID 10637 is mentioned above as pid1pid=10637, so it is the first pid of the container. Do you get any terminal output from your application?
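
To check from the host whether the container's first process is still alive, the pid1pid value can be read back from the stored info. This is a minimal sketch, assuming the key=value lines shown in the log are available in a file; the helper names get_info and pid_alive and the path in the example call are hypothetical.

```shell
#!/bin/bash
# Print the last value stored for KEY in a file of key=value lines.
get_info() {
  local key=$1 file=$2
  sed -n "s/^${key}=//p" "$file" | tail -n 1
}

# Succeed if a process with the given pid exists and can be signalled.
pid_alive() {
  kill -0 "$1" 2>/dev/null
}

# Example (the path is an assumption; adjust it to where store.info actually lives):
# pid=$(get_info pid1pid /path/to/store.info)
# if pid_alive "$pid"; then echo "container init is running"; else echo "container init has exited"; fi
```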

Btw., it seems that you use an old x11docker version. For further debugging, please update with x11docker --update or x11docker --update-master.

showfuture commented 2 years ago

When I run this shell script:

#!/bin/bash
while true
do

        read xenv < \<\( /usr/bin/x11docker --gpu --runtime=nvidia --hostdisplay --showenv --home=/extra_work_log --pull=yes --name=$DOCKER_NAME $REGISTRY_PULL_HOST/bigdata/unity3d-job:$BUILD_VERSION\)
done

it shows this error message:

/root/deploy/test.sh: line 6: <(/usr/bin/x11docker: No such file or directory

How can I fix this error?

mviereck commented 2 years ago

You should not escape the code:

\<\(
\)

Without \ it should work:

#!/bin/bash
while true
do

        read xenv < <( /usr/bin/x11docker --gpu --runtime=nvidia --hostdisplay --showenv --home=/extra_work_log --pull=yes --name=$DOCKER_NAME $REGISTRY_PULL_HOST/bigdata/unity3d-job:$BUILD_VERSION)
done

Note: You'll run an infinite number of containers with this loop! With < <(...), x11docker keeps running in the background while the script continues, so the loop will immediately run x11docker again.
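
The background behaviour can be seen without x11docker. In this minimal sketch, fake_job is a hypothetical stand-in that prints one line and then keeps running; read returns as soon as that first line arrives and does not wait for the job to finish.

```shell
#!/bin/bash
# Stand-in for x11docker: prints one line right away, then keeps running.
fake_job() {
  echo "DISPLAY=:0"
  sleep 2
}

SECONDS=0                      # bash counts elapsed seconds in this variable
read -r first_line < <(fake_job)
elapsed=$SECONDS

echo "read returned after ${elapsed}s with: $first_line"
```

The elapsed time stays below the two seconds that fake_job sleeps, which is why a bare while loop around such a read line starts the next container right away.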

However, rather than using a loop, find out why your application crashes and fix that. Does it print any error messages?

Did you update x11docker? The current version is 7.1.4, compare x11docker --version.

showfuture commented 2 years ago

My complete shell script is:

#!/bin/bash

DOCKER_NAME=$1
REGISTRY_PULL_HOST=$2
STATUS=$3
BUILD_VERSION=$4

echo "$DOCKER_NAME"
echo "$REGISTRY_PULL_HOST"
echo "$STATUS"
echo "$BUILD_VERSION"

while true
do
    echo "a"
    if [[ $STATUS == "stop" ]];then
        echo "111ss"
        sh /root/deploy/stop.sh
        echo "stopped"
        break
    elif [[ -n $(docker ps -q -f "name=$DOCKER_NAME") ]];then
        echo "ssss"
        echo -e "`date` \ncontainer $DOCKER_NAME is running" > /extra_work_log/deploy.log
    else
        echo "222sss"
        echo "read Xenv < <(x11docker --gpu --runtime=nvidia --hostdisplay --showenv --home=/extra_work_log --pull=yes --name=$DOCKER_NAME $REGISTRY_PULL_HOST/bigdata/unity3d-job:$BUILD_VERSION)"
        read xenv < \<\( /usr/bin/x11docker --gpu --runtime=nvidia --hostdisplay --showenv --home=/extra_work_log --pull=yes --name=$DOCKER_NAME $REGISTRY_PULL_HOST/bigdata/unity3d-job:$BUILD_VERSION\)

        echo -e "`date` \ncontainer $DOCKER_NAME has stopped"  >> /extra_work_log/deploy.log
    fi
    sleep 20
done

So there is no need to worry about "You'll run an infinite number of containers with this loop!"

But if I don't escape the code, I get:

/root/deploy/test.sh: line 27: syntax error near unexpected token `<'
/root/deploy/test.sh: line 27: `        read xenv < <(/usr/bin/x11docker --gpu --runtime=nvidia --hostdisplay --showenv --home=/extra_work_log --pull=yes --name=$DOCKER_NAME $REGISTRY_PULL_HOST/bigdata/unity3d-job:$BUILD_VERSION)'

So what can I do?

I have not updated x11docker. If I update it, I must change all of my mysql code! My x11docker version is "6.10.0".

mviereck commented 2 years ago

Maybe you run the script with sh script.sh? Then the script would be executed by sh instead of bash. The < <(...) syntax is specific to bash and does not work in sh.

Try to run the command without \ in a bash shell.
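
A script that depends on < <(...) can check its interpreter before reaching that line. This is a minimal sketch; the function name running_in_full_bash is an assumption for this illustration. It relies on two bash facts: BASH_VERSION is set only by bash, and bash started under the name sh switches on POSIX mode, in which at least older bash versions reject process substitution.

```shell
#!/bin/bash
# Succeed only when the script runs under bash in its normal (non-POSIX) mode.
# BASH_VERSION is empty in other shells such as dash; bash started under the
# name "sh" sets it but switches on POSIX mode, which shopt -qo posix detects.
running_in_full_bash() {
  [ -n "${BASH_VERSION:-}" ] && ! shopt -qo posix
}

if running_in_full_bash; then
  echo "interpreter check passed: bash"
else
  echo "this script uses < <(...); start it with 'bash script.sh' or './script.sh'" >&2
fi
```

In a real script the else branch would typically end with exit 1. Starting the script as ./script.sh honours the #!/bin/bash line, whereas sh script.sh ignores it.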