zellij-org / zellij

A terminal workspace with batteries included
https://zellij.dev
MIT License
21.09k stars 643 forks source link

zellij doesn't start in wezterm on M1 mac (ventura) - socket path should not be longer than 104 bytes #2817

Open znd4 opened 1 year ago

znd4 commented 1 year ago

Thank you for taking the time to file this issue! Please follow the instructions and fill in the missing parts below the instructions, if it is meaningful. Try to be brief and concise.

In Case of Graphical or Performance Issues

  1. Delete the contents of /tmp/zellij-1000/zellij-log, ie with cd /tmp/zellij-1000/ and rm -fr zellij-log/ (/tmp/ is $TMPDIR/ on OSX)
  2. Run zellij --debug
  3. Run stty size, copy the result and attach it in the bug report
  4. Recreate your issue.
  5. Quit Zellij immediately with ctrl-q (your bug should ideally still be visible on screen)

Please attach the files that were created in /tmp/zellij-1000/zellij-log/ to the extent you are comfortable with.

Basic information

zellij --version: zellij 0.38.2 stty size: 58 101

zellij --debug log: zellij.log

Crucially:

ERROR  |zellij_utils::errors::not| 2023-09-27 07:08:43.525 [server_listener] [zellij-utils/src/errors.rs:614]: Panic occured:
             thread: server_listener
             location: At zellij-server/src/lib.rs:285:73
             message: called `Result::unwrap()` on an `Err` value: Custom { kind: InvalidInput, error: "socket path should not be longer than 104 bytes" } 

uname -av or ver(Windows): Darwin VWGOAAHL333260 22.6.0 Darwin Kernel Version 22.6.0: Wed Jul 5 22:21:53 PDT 2023; root:xnu-8796.141.3~6/RELEASE_ARM64_T6020 arm64

List of programs you interact with as, PROGRAM --version: output cropped meaningful, for example:

wezterm --version: wezterm 20230712-072601-f4abf8fd

Further information

zellij seems to run fine on Terminal, iterm, and kitty

FWIW, I can't ctrl-q -- I'm stuck with no output:

❯ zellij --debug
imsnif commented 1 year ago

Hey, sorry for the trouble. Does this work if you killall -9 zellij?

znd4 commented 1 year ago

Ooh, no that doesn't seem to fix it, but I have managed to narrow the issue somewhat --

It seems like this issue only seems to reproduce when I've opened WezTerm from a shortcut (from MacOS's Shortcuts app)

Here's a screenshot for posterity, in case someone else somehow stumbles onto this issue:

image

Funnily enough though, when I zsh --login -c "open -a Wezterm" directly in Terminal, I'm able to use zellij just fine.

With that in mind, I'm tempted to think that this issue should be closed -- if I really want to keep opening wezterm from within a zsh --login session (makes it convenient for Wezterm to have all of the right PATH inclusions etc.), I'm sure I can find a convenient workaround that dodges whatever weirdness the Shortcuts app is doing

imsnif commented 1 year ago

Nono - we should support this. I don't know what's going on here, but I'm suspicious of this hard-coded socket path limitation... @jaeheonji - if I'm not mistaken I think you have a little more context on this verification. Do you have any ideas?

jaeheonji commented 1 year ago

@imsnif I think it's related in #2591 and #2607. Could it be that limiting the socket path is a problem? I thought this was what was discussed 😳

imsnif commented 1 year ago

@jaeheonji - I'm just wondering where the number 108 comes from...

imsnif commented 1 year ago

(just to make sure: no blame or anything! This isn't anyone's fault, just trying to understand)

jaeheonji commented 1 year ago

I think I misunderstood the history and reviewed it. I guess that whoever created the PR saw the number 108 you mentioned and hard-coded it. However, this is a problem because it is not a value considering a system such as Darwin. I think it's my mistake. I was only focusing the function and logic, so I missed about the issue history.

imsnif commented 1 year ago

No worries @jaeheonji - it happens! Do you know where we can get this number dynamically so it will fit all systems?

jaeheonji commented 1 year ago

I checked out some documentation and the raw code of the Rust Standard library.

First, I found good answers about this topic in stack overflow. Most UNIX-based systems (including macOS) seem to have 104 or 108 bytes as the magic number.

Then, I was able to check the socket_path magic number defined for each system in rust-lang/libc.

https://github.com/rust-lang/rust/blob/aeaa5c30e5c9041264a2e8314b68ad84c2dc3169/library/std/src/os/unix/net/stream.rs#L93-L102

This code internally call libc::sockaddr_un and check the socket_path length.

pub(super) fn sockaddr_un(path: &Path) -> io::Result<(libc::sockaddr_un, libc::socklen_t)> {
    // SAFETY: All zeros is a valid representation for `sockaddr_un`.
    let mut addr: libc::sockaddr_un = unsafe { mem::zeroed() };
    addr.sun_family = libc::AF_UNIX as libc::sa_family_t;

    let bytes = path.as_os_str().as_bytes();

    if bytes.contains(&0) {
        return Err(io::const_io_error!(
            io::ErrorKind::InvalidInput,
            "paths must not contain interior null bytes",
        ));
    }

    // HERE
    if bytes.len() >= addr.sun_path.len() {
        return Err(io::const_io_error!(
            io::ErrorKind::InvalidInput,
            "path must be shorter than SUN_LEN",
        ));
    }

    ...
}

And, socket_path can be checked in the libc package as the sun_path variable.

// libc/src/unix/bsd/mod.rs
pub struct sockaddr_un {
    pub sun_len: u8,
    pub sun_family: sa_family_t,
    pub sun_path: [c_char; 104]  // THIS
}

// libc/src/unix/linux_like/mod.rs
pub struct sockaddr_un {
    pub sun_family: sa_family_t,
    pub sun_path: [::c_char; 108] // THIS
}

So, I think we can dynamically get the maximum socket_path value for each system by calling sun_path.len() using libc. Do you think this makes sense?

imsnif commented 1 year ago

Yes @jaeheonji - this looks good, thanks for addressing this so fast! I hope this did not stress you too much, I'm not sure why we encountered this here, but this seems very much like an edge case - so I don't think the fix is very urgent.

Please make sure to put this under the os interface (I think it's called OsInputOutput but I don't have it in front of me) so that we can mock this up in tests and make it work for other OSs if we need to. If it fails to get the info from libc for some reason, it should not crash and return a None instead, and then we'll do what we did up until now. Makes sense?

jaeheonji commented 1 year ago

@imsnif Great! then If it's not urgent now, can we handle this problem after doing what we originally wanted to do (plugin web)?

imsnif commented 1 year ago

Yes, definitely! I'll assign this issue for you and let's take care of it after the plugin web.