checkpoint-restore / criu

Checkpoint/Restore tool
criu.org

cannot checkpoint a vnc server container #2459

Open coldbloodx opened 1 month ago

coldbloodx commented 1 month ago

**Description:**
A VNC server container cannot be checkpointed.

Steps to reproduce the issue:

  1. Create a VNC server image from the ubuntu 24.04 base image, installing only the packages below: `apt install -y xfce4 xfce4-session xfce4-terminal tightvncserver xauth`
  2. Initialize the VNC server files (password and xstartup script). Here is my xstartup script:
    
    [root@laworks .vnc]# cat xstartup
    #!/bin/sh

    xrdb "$HOME/.Xresources"
    xsetroot -solid grey
    x-terminal-emulator -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
    x-window-manager &
    # Fix to make GNOME work
    export XKL_XMODMAP_DISABLE=1
    /etc/X11/Xsession


  3. Run the container with the docker command below:

    [root@laworks ~]# docker run -d -v /root:/root -e USER=root --net=host --name cstest ubuntuvnc:v1 bash -c 'vncserver; tail -f /dev/null'
    [root@laworks ~]# docker ps
    CONTAINER ID   IMAGE          COMMAND                  CREATED       STATUS       PORTS   NAMES
    3c0043e01e67   ubuntuvnc:v1   "bash -c 'vncserver;…"   3 hours ago   Up 3 hours           cstest


  4. Connect to the vncserver with vncviewer, open a terminal in the viewer, then run the command shown below:
![image](https://github.com/user-attachments/assets/c1baf949-3e33-408f-ac56-cf2c4193e61d)

**Describe the results you received:**
Checkpointing the vncserver container above fails with the following error:

    [root@laworks ~]# docker checkpoint create cstest cp001
    Error response from daemon: Cannot checkpoint container cstest: runc did not terminate successfully: exit status 1: criu failed: type NOTIFY errno 0 path= /run/containerd/io.containerd.runtime.v2.task/moby/3c0043e01e6729d2a9745d26bbc2a3baa3faddb4c169cb217d777dc24fe868d0/criu-dump.log: unknown

The relevant part of criu-dump.log:

    (00.392350) sockets: Searching for socket 0x79317 family 1
    (00.392376) sockets: No filter for socket
    (00.392378) unix: Dumping unix socket at 6
    (00.392379) unix: Dumping: ino 496407 peer_ino 0 family 1 type 1 state 10 name /root/.cache/ibus/dbus-mHfDLJSx
    (00.392384) unix: Dumped: id 0x29 ino 496407 peer 0 type 1 state 10 name 32 bytes
    (00.392394) 77977 fdinfo 7: pos: 0 flags: 2/0x1
    (00.392402) Error (criu/files-ext.c:94): Can't dump file 7 of that type [600] (anon anon_inode:[pidfd]) ---> this line.
    (00.392410) ----------------------------------------
    (00.392424) Error (criu/cr-dump.c:1674): Dump files (pid: 77977) failed with -1
    (00.392434) Waiting for 77977 to trap
    (00.392453) Daemon 77977 exited trapping
    (00.392459) Sent msg to daemon 3 0 0

The full log is attached to this ticket.
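For anyone triaging a similar failure, a small helper can pull just the error lines out of a long criu-dump.log. This is only a sketch: the function name `criu_log_errors` is made up for illustration, and it assumes error lines contain the literal text `Error (` as in the excerpt above.

```shell
# Hypothetical helper (not part of CRIU): print the error lines of a
# CRIU log together with their line numbers.
# Assumption: error lines contain the literal string "Error (".
criu_log_errors() {
    grep -n 'Error (' "$1"
}
```

Calling it with the path of the dump log reported by `docker checkpoint create` prints the two `Error (...)` lines from the excerpt, prefixed with their line numbers in the file.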

**Describe the results you expected:**
The container should be checkpointed successfully.

**Additional information you deem important (e.g. issue happens only occasionally):**
Below is the process info for my vncserver container.
I've tried this several times, and it has never succeeded with docker.

    [root@laworks ~]# ps -ewf |grep shim
    root       77840       1  0 11:53 ?        00:00:01 /usr/bin/containerd-shim-runc-v2 -namespace moby -id 3c0043e01e6729d2a9745d26bbc2a3baa3faddb4c169cb217d777dc24fe868d0 -address /run/containerd/containerd.sock
    root       97514   64766  0 14:34 pts/2    00:00:00 grep --color=auto shim
    [root@laworks ~]# pstree -cups 77840
    systemd(1)───containerd-shim(77840)─┬─tail(77860)─┬─Xtightvnc(77884)
                                        │             ├─ibus-daemon(77977)─┬─ibus-dconf(77981)─┬─{ibus-dconf}(77986)
                                        │             │                    │                   ├─{ibus-dconf}(77987)
                                        │             │                    │                   ├─{ibus-dconf}(77989)
                                        │             │                    │                   └─{ibus-dconf}(77991)
                                        │             │                    ├─ibus-engine-sim(78023)─┬─{ibus-engine-sim}(78024)
                                        │             │                    │                        ├─{ibus-engine-sim}(78025)
                                        │             │                    │                        └─{ibus-engine-sim}(78026)
                                        │             │                    ├─ibus-extension-(77983)─┬─{ibus-extension-}(77997)
                                        │             │                    │                        ├─{ibus-extension-}(78001)
                                        │             │                    │                        ├─{ibus-extension-}(78004)
                                        │             │                    │                        └─{ibus-extension-}(78013)
                                        │             │                    ├─ibus-ui-gtk3(77982)─┬─{ibus-ui-gtk3}(77998)
                                        │             │                    │                     ├─{ibus-ui-gtk3}(78000)
                                        │             │                    │                     ├─{ibus-ui-gtk3}(78006)
                                        │             │                    │                     ├─{ibus-ui-gtk3}(78011)
                                        │             │                    │                     └─{ibus-ui-gtk3}(78015)
                                        │             │                    ├─{ibus-daemon}(77978)
                                        │             │                    ├─{ibus-daemon}(77979)
                                        │             │                    └─{ibus-daemon}(77990)
                                        │             ├─ibus-x11(77985)─┬─{ibus-x11}(77996)
                                        │             │                 ├─{ibus-x11}(77999)
                                        │             │                 └─{ibus-x11}(78005)
                                        │             ├─x-window-manage(77894)
                                        │             ├─xfce4-terminal(77893)─┬─bash(78020)───sleep(97530)
                                        │             │                       ├─{xfce4-terminal}(77914)
                                        │             │                       ├─{xfce4-terminal}(77915)
                                        │             │                       ├─{xfce4-terminal}(77944)
                                        │             │                       └─{xfce4-terminal}(78019)
                                        │             └─xstartup(77890)
                                        ├─{containerd-shim}(77841)
                                        ├─{containerd-shim}(77842)
                                        ├─{containerd-shim}(77843)
                                        ├─{containerd-shim}(77844)
                                        ├─{containerd-shim}(77845)
                                        ├─{containerd-shim}(77846)
                                        ├─{containerd-shim}(77847)
                                        ├─{containerd-shim}(77848)
                                        ├─{containerd-shim}(77866)
                                        ├─{containerd-shim}(78098)
                                        └─{containerd-shim}(81087)
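The dump log names fd 7 of pid 77977 (ibus-daemon in the tree above) as the descriptor CRIU could not dump. To check whether other processes in the container also hold pidfds, one option is to scan the `fd` directories under `/proc` for links whose target is `anon_inode:[pidfd]`, the string shown in the log. The sketch below assumes a Linux `/proc` layout and root access; the function name `list_pidfds` is hypothetical.

```shell
# Hypothetical helper: list every file descriptor under a /proc-like
# directory whose symlink target is a pidfd.
# Assumption: pidfds show up as "anon_inode:[pidfd]", as in the dump log.
list_pidfds() {
    proc_root="${1:-/proc}"
    for fd in "$proc_root"/[0-9]*/fd/*; do
        # Skip unmatched globs and entries that are not symlinks.
        [ -L "$fd" ] || continue
        # The process may exit while we scan; ignore failed reads.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            'anon_inode:[pidfd]') printf '%s -> %s\n' "$fd" "$target" ;;
        esac
    done
}
```

Run as root on the host, `list_pidfds /proc` prints one line per matching descriptor. The pid in each path can then be matched against the `pstree` output.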


**CRIU logs and information:**

<!--
full log: [criu-dump.log](https://github.com/user-attachments/files/16451489/criu-dump.log)
-->

<details><summary>CRIU full dump/restore logs:</summary>
<p>
[criu-dump.log](https://github.com/user-attachments/files/16451457/criu-dump.log)
</p>
</details>
<details><summary>Output of `criu --version`:</summary>
<p>

    [root@laworks ~]# criu --version
    Version: 3.19


</p>
</details>

<details><summary>Output of `criu check --all`:</summary>
<p>

    [root@laworks ~]# criu check --all
    Warn (criu/cr-check.c:1346): Nftables based locking requires libnftables and set concatenations support
    Looks good but some kernel features are missing which, depending on your process tree, may cause dump or restore failure.

</p>
</details>

**Additional environment details:**

adrianreber commented 1 month ago

This sounds like it might be solved with #2449.

coldbloodx commented 1 month ago

> This sounds like it might be solved with #2449.

I checked #2449 and it is still in draft state. When will the fix be merged? I'd like to try it immediately.

adrianreber commented 1 month ago

> > This sounds like it might be solved with #2449.
>
> I checked #2449 and it is still in draft state. When will the fix be merged? I'd like to try it immediately.

This is almost impossible to predict. But you can already try it now: build CRIU yourself with the patches from #2449 applied and see if it actually solves your problem.
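One possible way to do that, sketched under a few assumptions: GitHub exposes each pull request as the ref `pull/<number>/head`, CRIU builds with a plain `make` once its build dependencies are installed, and the helper names (`run`, `build_criu_pr`) and the checkout directory are made up for illustration. Setting `DRY_RUN=1` only prints the commands.

```shell
# Hypothetical wrapper: run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: check out a CRIU pull request and build it.
# Assumption: build dependencies are already installed on the host.
build_criu_pr() {
    pr="$1"                    # pull request number, e.g. 2449
    dir="${2:-criu-pr-$pr}"    # checkout directory (illustrative default)
    run git clone https://github.com/checkpoint-restore/criu.git "$dir" || return 1
    run git -C "$dir" fetch origin "pull/$pr/head:pr-$pr" || return 1
    run git -C "$dir" checkout "pr-$pr" || return 1
    run make -C "$dir"
}
```

For example, `DRY_RUN=1 build_criu_pr 2449` shows the four commands without touching the network. Making sure the freshly built `criu` binary is the one runc actually invokes is a separate step not covered here.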

coldbloodx commented 1 month ago

@adrianreber Thanks bro, I'll try it later

github-actions[bot] commented 1 week ago

A friendly reminder that this issue had no activity for 30 days.