lima-vm / lima

Linux virtual machines, with a focus on running containers
https://lima-vm.io/
Apache License 2.0
15.5k stars 608 forks

Lima's X11 forwarding breaks after a MacBook goes to sleep #2099

Open ilyagr opened 11 months ago

ilyagr commented 11 months ago

Summary

I'm trying to get Lima to work with XQuartz to run graphical applications. It mostly just works, except when my MacBook goes to sleep and wakes up again. After that, applications inside Lima can no longer connect to X11 and fail with messages like:

xterm: Xt error: Can't open display: localhost:12.0

Cc: #151, #877

My setup

I'm using Sonoma 14.2.1 on an M2 Apple silicon MacBook Pro. I installed XQuartz from its website; it reports XQuartz 2.8.5 (xorg-server 21.1.6). I'm running a Debian container. I added the following lines to the config file using limactl edit:

ssh:
  forwardX11: true
  forwardAgent: true

I also ran sudo apt install xterm in the container.

Here is some more sysinfo:

# On host system
$ uname -a
Darwin macaw.local 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:55:06 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6020 arm64 arm Darwin
$ limactl shell debian uname -a
Linux lima-debian 6.1.0-16-cloud-arm64 #1 SMP Debian 6.1.67-1 (2023-12-12) aarch64 GNU/Linux
$ lima --version
limactl version 0.19.0

This is a QEMU VM.

VM config (`lima.yaml`)

```yaml
# This template requires Lima v0.7.0 or later
images:
# Try to use release-yyyyMMdd image if available. Note that release-yyyyMMdd will be removed after several months.
- location: "https://cloud.debian.org/images/cloud/bookworm/20231013-1532/debian-12-genericcloud-amd64-20231013-1532.qcow2"
  arch: "x86_64"
  digest: "sha512:6b55e88b027c14da1b55c85a25a9f7069d4560a8fdb2d948c986a585db469728a06d2c528303e34bb62d8b2984def38fd9ddfc00965846ff6e05b01d6e883bfe"
- location: "https://cloud.debian.org/images/cloud/bookworm/20231013-1532/debian-12-genericcloud-arm64-20231013-1532.qcow2"
  arch: "aarch64"
  digest: "sha512:b3754e8c4b474fad2f0bb6d483158cc8e6661cf481dcd7a8c55cc128acb4cd2d829d4afe024462ae45028f33ab977d69737d820c8f6c56800cc133cdcfb5874d"
# Fallback to the latest release image.
# Hint: run `limactl prune` to invalidate the cache
- location: "https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-genericcloud-amd64.qcow2"
  arch: "x86_64"
- location: "https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-genericcloud-arm64.qcow2"
  arch: "aarch64"
mounts:
- location: "~"
- location: "/tmp/lima"
  writable: true
ssh:
  forwardX11: true
  forwardAgent: true
```

However, I also checked that this affects a VZ VM based on the "default" template in exactly the same way.

The problem

If I have XQuartz running, limactl shell debian xterm works fine at first. However, if I close the lid of the computer and reopen it an hour later, it no longer works. The error message is what I quoted before: xterm: Xt error: Can't open display: localhost:12.0.

Interestingly, the xterm I ran before closing the lid still runs and responds to new input after the lid is reopened. However, I cannot start new graphical applications from it either.
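As a rough diagnostic for the error above, the sketch below could be run inside the guest. It relies on the standard X11 convention that TCP display N listens on port 6000 + N, so `localhost:12.0` maps to port 6012. The helper names `display_to_tcp` and `can_connect` are made up for this illustration and are not part of Lima or XQuartz.

```python
import os
import socket
from typing import Tuple

X11_BASE_PORT = 6000  # TCP display N listens on port 6000 + N


def display_to_tcp(display: str) -> Tuple[str, int]:
    """Split a DISPLAY value such as 'localhost:12.0' into (host, tcp_port)."""
    host, sep, rest = display.rpartition(":")
    if not sep or not host:
        # Values like ':0' refer to a local socket, not a TCP display.
        raise ValueError(f"not a TCP display: {display!r}")
    number = int(rest.split(".")[0])  # drop the optional '.screen' suffix
    return host, X11_BASE_PORT + number


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    display = os.environ.get("DISPLAY", "")
    try:
        host, port = display_to_tcp(display)
    except ValueError as exc:
        print(exc)
    else:
        state = "accepting connections" if can_connect(host, port) else "not reachable"
        print(f"{display} -> {host}:{port} is {state}")
```

A successful connection only shows that something in the guest is still listening on the forwarded display port. It does not prove that the forwarded channel still reaches XQuartz on the host, so this can narrow the problem down but not confirm the cause.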

Workaround

Restarting the VM fixes the problem. I haven't yet found a better workaround; please let me know if you know one!

afbjorklund commented 11 months ago

We need a way to restore the SSH connections after sleep, both for the host and for the guest:

ilyagr commented 10 months ago

Idle thought: This might be overkill, but perhaps Lima could allow connecting to the guest via https://www.wireguard.com/install/ if Wireguard is installed on the host. I'm more familiar with the simpler but non-free Tailscale, so I don't know just how tricky this would be; it would likely have other applications as well.

https://github.com/linuxserver/docker-wireguard could be useful to look at as an example.

AkihiroSuda commented 10 months ago

> Idle thought: This might be overkill, but perhaps Lima could allow connecting to the guest via https://www.wireguard.com/install/ if Wireguard is installed on the host. I'm more familiar with the simpler but non-free Tailscale, so I don't know just how tricky this would be; it would likely have other applications as well.
>
> https://github.com/linuxserver/docker-wireguard could be useful to look at as an example.

Was this intended to be posted in another place? Doesn't seem relevant to X11.

ilyagr commented 10 months ago

I was thinking that if the SSH connection (with X forwarding) went via Wireguard, it wouldn't break on sleep AFAIK. I should've mentioned that, sorry for the confusion.

This is based on my experience with Tailscale (which is commercial but freemium). Installing Tailscale in the VM and using it for SSH should be a smoother workaround for the problem, but I haven't double-checked yet.

jin0g commented 10 months ago

I have been having the exact same problem for months. Is there any update?

jack2gs commented 7 months ago

Same issue here. It looks like the SSH server in the VM died at some point after the host woke up. Restarting works for me but is quite annoying. macOS: 14.4.1 (23E224)

When I stop the VM, it complains that there is no ssh.sock:


```
INFO[0000] Waiting for the host agent and the driver processes to shut down
INFO[0000] [hostagent] Received SIGINT, shutting down the host agent
INFO[0000] [hostagent] Shutting down the host agent
INFO[0000] [hostagent] Stopping forwarding "/run/lima-guestagent.sock" (guest) to "/Users/jgao/.lima/default/ga.sock" (host)
INFO[0000] [hostagent] Unmounting "/Users/jgao"
INFO[0000] [hostagent] Unmounting "/tmp/lima"
WARN[0000] [hostagent] failed to exit SSH master         error="failed to execute `ssh -O exit -p 60022 127.0.0.1`, out=\"Control socket connect(/Users/jgao/.lima/default/ssh.sock): No such file or directory\\r\\n\": exit status 255"
WARN[0000] [hostagent] an error during shutting down the host agent  error="failed to run [ssh -F /dev/null -o IdentityFile=\"/Users/jgao/.lima/_config/user\" -o IdentityFile=\"/Users/jgao/.ssh/id_rsa\" -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o NoHostAuthenticationForLocalhost=yes -o GSSAPIAuthentication=no -o PreferredAuthentications=publickey -o Compression=no -o BatchMode=yes -o IdentitiesOnly=yes -o Ciphers=\"^aes128-gcm@openssh.com,aes256-gcm@openssh.com\" -o User=jgao -o ControlMaster=auto -o ControlPath=\"/Users/jgao/.lima/default/ssh.sock\" -o ControlPersist=yes -T -O cancel -L /Users/jgao/.lima/default/ga.sock:/run/lima-guestagent.sock -N -f -p 60022 127.0.0.1 --]: \"\": exit status 255"
INFO[0000] [hostagent] Shutting down QEMU with ACPI
INFO[0000] [hostagent] Sending QMP system_powerdown command
FATA[0180] did not receive an event with the "exiting" status
```

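
The log above suggests that the SSH control socket had already disappeared by the time the VM was stopped. A minimal sketch for checking this on the host before stopping the VM might look like the following. It assumes the socket lives at `~/.lima/<instance>/ssh.sock`, as the paths in the log indicate; the helper names are hypothetical, and the only ssh options used are the standard `-S` (control path) and `-O check`.

```python
from pathlib import Path
from typing import List, Optional


def ssh_sock_path(instance: str, home: Optional[Path] = None) -> Path:
    """Path of the control socket, assuming the layout seen in the log above."""
    base = home if home is not None else Path.home()
    return base / ".lima" / instance / "ssh.sock"


def master_check_argv(instance: str, home: Optional[Path] = None) -> List[str]:
    """Build (but do not run) an 'ssh -O check' command for the control socket."""
    sock = ssh_sock_path(instance, home)
    return ["ssh", "-S", str(sock), "-O", "check", "127.0.0.1"]


if __name__ == "__main__":
    sock = ssh_sock_path("default")
    print(f"{sock}: {'present' if sock.is_socket() else 'missing'}")
    print("to query the master:", " ".join(master_check_argv("default")))
```

If the socket is missing while the VM is still running, that would match the warning in the log, although it would not explain on its own why the master connection went away.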
ilyagr commented 7 months ago

It actually seems to me that this is an X11-specific issue, and not an issue with the SSH connection itself, at least for me.

If I use limactl shell, that connection survives a sleep-resume cycle fine. I believe this connection goes over SSH. The terminal part of the connection still works after the resume. The X11 forwarding only works until the computer goes to sleep, though.

Am I misunderstanding something?

ilyagr commented 6 months ago

Another interesting option for working around this is https://github.com/Xpra-org/xpra or the (seemingly less mature) https://github.com/wayland-transpositor/wprs.

I'll leave another comment if I actually test them.

Update: No luck so far. Xpra crashes on startup on macOS Sonoma (Apple Silicon), see https://github.com/Xpra-org/xpra/issues/4017#issuecomment-2105506722. It seems to be some sort of GTK issue. WPRS doesn't seem to support macOS at all.

afbjorklund commented 6 months ago

Historically there was NX for this type of setup; not sure if anyone still uses it since it stopped being open source.

https://en.wikipedia.org/wiki/NX_technology

ilyagr commented 6 months ago

NX doesn't seem to have any binaries for Apple Silicon; the best I can find is https://downloads.nomachine.com/download/?id=7. That might work via Rosetta, but it makes me suspicious that they haven't updated their site since Apple Silicon appeared in the world. They DO ship binaries for ARM Linux, so maybe the part inside Lima would work. Since they are closed-source, other people cannot try to compile better binaries.

To me, NX sounds very similar to VNC or RDP (https://github.com/neutrinolabs/xrdp). Both seem to have good Mac OS clients and open-source Linux clients.

They have a couple of disadvantages: unlike X11 forwarding with XQuartz and unlike Xpra, they show an entire Linux desktop in one window, and every remote app lives in that one window. Also, I found both hard to configure inside Lima (though I'm sure it's possible with enough effort).

ilyagr commented 2 weeks ago

Recently, I have had to install xauth explicitly inside the VM, as per https://github.com/lima-vm/lima/issues/151#issuecomment-1652518725, for X11 forwarding to work.
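
For anyone who wants to check that prerequisite before debugging further, here is a small sketch that could be run inside the VM. It only looks for the `xauth` and `xterm` executables on `PATH`; the function name `missing_x11_tools` is invented for this example, and the tool list is an assumption based on the setup described in this issue.

```python
import shutil
from typing import Iterable, List


def missing_x11_tools(tools: Iterable[str] = ("xauth", "xterm")) -> List[str]:
    """Return the names from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_x11_tools()
    if missing:
        # On Debian these can typically be installed with apt.
        print("missing:", ", ".join(missing))
    else:
        print("xauth and xterm are both on PATH")
```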