wez / wezterm

A GPU-accelerated cross-platform terminal emulator and multiplexer written by @wez and implemented in Rust
https://wezfurlong.org/wezterm/

mux server silently fails when no space is left on device #1839

Open ahupp opened 2 years ago

ahupp commented 2 years ago

What Operating System(s) are you seeing this problem on?

Windows, Linux Wayland

WezTerm version

20220407-215528-d356b72c

Did you try the latest nightly build to see if the issue is better (or worse!) than your current version?

Yes, and I updated the version box above to show the version of the nightly that I tried

Describe the bug

I'm setting up an ssh domain with wezterm. The client is Windows 11, the server is Ubuntu 21, both running the same version of wezterm (same behavior with stable and nightly).

On connect I get this error:

Connecting to the-rack-wsl using SSH
Using libssh-rs to connect to adam@the-rack-wsl:22
SSH-2.0-OpenSSH_8.4p1 Ubuntu-6ubuntu2.1
Running: wezterm cli proxy
Checking server version
Please install the same version of wezterm on both the client and server! The server reported error 'Error while decoding response pdu: decoding a PDU: reading PDU length: EOF while reading leb128 encoded value' while being asked for its version.  This likely means that the server is older than the client.
....

The underlying cause is that wezterm-mux-server fails silently on startup because of an unexpected ENOSPC error when writing to /run.

If I run wezterm-mux-server manually (without --daemonize) it works fine. See logs section for more details.

To Reproduce

No response

Configuration

This is the config I added for the ssh_domain:

  ssh_domains = {
    {
      name = "the-rack",
      remote_address = "the-rack-wsl",
      username = "adam",
    }
  },

Expected Behavior

No response

Logs

Client log:

C:\Users\adam>wezterm
06:51:42.442  INFO  wezterm_mux_server_impl::local > setting up C:\Users\adam\.local/share/wezterm\gui-sock-15900
06:51:42.444  INFO  wezterm_client::discovery::windows > published gui path as gui-sock-15900
06:51:42.472  WARN  window::os::windows::window > EGL init failed Config says to avoid EGL, fall back to WGL
06:51:42.601  INFO  wezterm_gui::termwindow > OpenGL initialized! Intel(R) Iris(R) Xe Graphics 4.5.0 - Build 30.0.101.1340 is_context_loss_possible=true wezterm version: 20220407-215528-d356b72c
06:51:46.586  WARN  window::os::windows::window > EGL init failed Config says to avoid EGL, fall back to WGL
06:51:46.641  INFO  wezterm_gui::termwindow > OpenGL initialized! Intel(R) Iris(R) Xe Graphics 4.5.0 - Build 30.0.101.1340 is_context_loss_possible=true wezterm version: 20220407-215528-d356b72c
06:51:49.146  ERROR wezterm_client::client > going to run wezterm cli proxy
06:51:50.025  ERROR wezterm_client::client > ssh stderr: 06:51:13.550  WARN   wezterm_client::client > While connecting to Socket("/run/user/1000/wezterm/sock"): connecting to /run/user/1000/wezterm/sock.  Will try spawning the server.
06:51:13.550  WARN   wezterm_client::client > Running: "/usr/bin/wezterm-mux-server" "--daemonize"

06:51:50.677  ERROR wezterm_client::client > ssh stderr: 06:51:14.202  ERROR  wezterm                > (after spawning server) failed to connect to Socket("/run/user/1000/wezterm/sock"): connecting to /run/user/1000/wezterm/sock: Connection refused (os error 111); terminating

06:51:50.794  ERROR wezterm_client::client > wezterm cli proxy failed
06:51:50.802  ERROR wezterm_client::client > Error while decoding response pdu: decoding a PDU: reading PDU length: EOF while reading leb128 encoded value
06:51:50.804  ERROR mux::connui > while running ConnectionUI loop: recv_timeout: channel is empty and disconnected
06:51:50.804  ERROR wezterm_client::domain > detached domain 2
06:51:50.807  ERROR mux > domain detached panes: []

The mux server fails silently, but strace shows a write failing with ENOSPC:

[pid 4189751] dup2(5, 2 <unfinished ...>
[pid 4189751] <... dup2 resumed>)       = 2
[pid 4189751] write(3, "4189751", 7 <unfinished ...>
[pid 4189751] <... write resumed>)      = -1 ENOSPC (No space left on device)

And just writing to /run produces ENOSPC:

echo "ahhhhhh" > /run/user/1000/wezterm/test.txt
write: No space left on device

/run is a tmpfs

Anything else?

Ideally wezterm would be clearer when the mux server fails here, by doing more work before backgrounding and reporting any errors synchronously to the client.
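
As a rough illustration of "doing more work before backgrounding", here is a minimal sketch of a preflight check that proves the runtime directory accepts a real write before the process detaches. The function names `check_runtime_dir_writable` and `write_probe` are hypothetical and are not part of wezterm; the sketch only assumes that the failing path is a directory such as /run/user/1000/wezterm.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: write a few bytes to `path` and force them out,
/// so that an ENOSPC condition surfaces here rather than after forking.
fn write_probe(path: &Path) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;
    file.write_all(b"probe")?;
    file.sync_all()
}

/// Hypothetical preflight check (not wezterm's actual code): returns an
/// error if `dir` cannot be created or does not accept a small write.
/// A caller would run this before daemonizing and report the error to
/// the client while stderr is still connected.
pub fn check_runtime_dir_writable(dir: &Path) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    let probe = dir.join(format!(".probe-{}", std::process::id()));
    let result = write_probe(&probe);
    // Best-effort cleanup; the probe result is what matters.
    let _ = fs::remove_file(&probe);
    result
}
```

Creating an empty file can succeed on a full tmpfs while the first real write fails, which is why the probe writes bytes instead of only opening the file.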

wez commented 2 years ago

Hmm, I think it's eprintln! that is likely panicking when the write fails. I've made a speculative change that just ignores that class of error. It's not ideal. Doing something more robust will take a bit of thought!
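
For context, `eprintln!` panics when the write to stderr fails, whereas `writeln!` returns an `io::Result` that the caller can choose to ignore. The sketch below shows that general pattern; it is not the actual change made in wezterm, and `log_best_effort` and `FullDevice` are hypothetical names used only for illustration.

```rust
use std::io::{self, Write};

/// Hypothetical helper: write one line to `out` and report success as a
/// bool instead of panicking when the underlying write fails.
pub fn log_best_effort<W: Write>(out: &mut W, msg: &str) -> bool {
    writeln!(out, "{}", msg).and_then(|_| out.flush()).is_ok()
}

/// Hypothetical writer that always fails, standing in for a stderr that
/// has been redirected to a file on a full filesystem.
pub struct FullDevice;

impl Write for FullDevice {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::Other, "No space left on device"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

Passing `&mut std::io::stderr()` as `out` gives the non-panicking equivalent of `eprintln!`. The trade-off is the one noted above: the error is swallowed, so the failure is still invisible unless it is reported through another channel.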

ahupp commented 2 years ago

IMO if you can't write to /run, it's better to just panic; lots of other stuff would be broken in that case too.