zellij-org / zellij

A terminal workspace with batteries included
https://zellij.dev
MIT License

Zellij Failing to Launch from Home Directory #2369

Closed RPG-Alex closed 1 year ago

RPG-Alex commented 1 year ago

Picture of Issue: image

Responses to checklist:

  1. Delete the contents of /tmp/zellij-1000/zellij-log, ie with cd /tmp/zellij-1000/ and rm -fr zellij-log/ (/tmp/ is $TMPDIR/ on OSX) --Done
  2. Run zellij --debug --Done: From home: image From not home: --runs zellij successfully
  3. Run stty size, copy the result and attach it in the bug report 48 95
  4. Recreate your issue. From my shell (nu shell) I attempt to launch zellij. I do so by typing "zellij". Instead of launching I get the error shown above, I'll paste it here too:

    Error occurred in server:
    
    × Thread 'async-std/runtime' panicked.
    ├─▶ At library/std/src/sys/unix/time.rs:69:9
    ╰─▶ assertion failed: tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC as i64
    help: If you are seeing this message, it means that something went wrong.
    
        -> To get additional information, check the log at: /tmp/zellij-1000/zellij-log/zellij.log
        -> To see a backtrace next time, reproduce the error with: RUST_BACKTRACE=1 zellij [...]
        -> To help us fix this, please open an issue: https://github.com/zellij-org/zellij/issues

    Following instructions and trying backtrace:

    
    ~> RUST_BACKTRACE=1 zellij                                               04/16/2023 11:59:44 AM

Error occurred in server:

× Thread 'async-std/runtime' panicked.
├─▶ At library/std/src/sys/unix/time.rs:69:9
╰─▶ assertion failed: tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC as i64

  Panic backtrace:
     0: <unknown>
     1: <unknown>
     2: <unknown>
     3: <unknown>
     4: <unknown>
     5: <unknown>
     6: <unknown>
     7: <unknown>
     8: <unknown>
     9: <unknown>
    10: <unknown>
    11: <unknown>
    12: <unknown>
    13: <unknown>
    14: <unknown>
    15: <unknown>
    16: <unknown>
    17: <unknown>
    18: <unknown>
    19: <unknown>

help: If you are seeing this message, it means that something went wrong.

    -> To get additional information, check the log at: /tmp/zellij-1000/zellij-log/zellij.log
    -> To see a backtrace next time, reproduce the error with: RUST_BACKTRACE=1 zellij [...]
    -> To help us fix this, please open an issue: https://github.com/zellij-org/zellij/issues
5. Quit Zellij immediately with ctrl-q (your bug should ideally still be visible on screen)
[zellij.log](https://github.com/zellij-org/zellij/files/11241187/zellij.log)
Please attach the files that were created in `/tmp/zellij-1000/zellij-log/` to the extent you are comfortable with.
--Done

**Basic information**

`zellij --version`:
**`zellij 0.36.0`**
`stty size`:
`48 95`
`uname -av` or `ver`(Windows):
`Linux quick-steel 6.1.24-1-lts #1 SMP PREEMPT_DYNAMIC Thu, 13 Apr 2023 17:22:35 +0000 x86_64 GNU/Linux`
List of programs you interact with as, `PROGRAM --version`: output cropped meaningful, for example:

**NuShell:**

    ~> nu --version                                                          04/16/2023 12:06:16 PM
    0.78.0

**Wezterm:**

    ~> wezterm --version                                                     04/16/2023 12:06:21 PM
    wezterm 20230408-112425-69ae8472


**Further information**
I use Arch and installed zellij from the pacman repo:

community/zellij 0.36.0-1 [installed] A terminal multiplexer



Again, the application works fine if I navigate to any folder other than HOME. I saw a similar issue here: https://github.com/zellij-org/zellij/issues/322 but was not able to use it to resolve my issue. Not sure if this is a bug or a local environment issue.
raphCode commented 1 year ago

Thanks for the report! I haven't looked into the details yet, but I remember similar unresolved issues in the past: https://github.com/zellij-org/zellij/issues/2357 https://github.com/zellij-org/zellij/issues/2123

Sadly we can't reproduce the problem yet. You seem to have one of the rare setups which reliably triggers the bug. Can you please help us pinpoint the issue?

RPG-Alex commented 1 year ago

> Thanks for the report! I haven't looked into the details yet, but I remember similar unresolved issues in the past: #2357 #2123
>
> Sadly we can't reproduce the problem yet. You seem to have one of the rare setups which reliably triggers the bug. Can you please help us pinpoint the issue?
>
> * A debug build could yield a detailed stacktrace.
>   With rust installed, cloning the repo and executing `cargo xtask build` should place a binary in `target/debug/zellij` which you can then execute in your homedir
>
> * Try adding a new system user with an empty home directory. Does it happen there?
>   To add a new user from the terminal, look at `useradd`
>
> * Try a different system user?

Apologies, hadn't realized replying would close the comment. Will reopen but I will absolutely do this later this week. Thanks for the prompt response. Happy to contribute any way I can!

tlinford commented 1 year ago

looks like this might be related: https://github.com/rust-lang/rust/issues/108277.

Any chance you are on btrfs @RPG-Alex?

tlinford commented 1 year ago

Also, not sure if it was missed, one of the other reports does include a stacktrace: https://github.com/zellij-org/zellij/issues/2123#issuecomment-1418926757 Looking at that it seems to originate around here (the code has moved recently I think) : https://github.com/zellij-org/zellij/blob/d385c73e045984e18879a34f1110d5e1fe3d46b8/zellij-server/src/plugins/plugin_loader.rs#L477-L491

RPG-Alex commented 1 year ago

Ok! Finally had time to sit down here.

So, I have followed your steps here:

> Thanks for the report! I haven't looked into the details yet, but I remember similar unresolved issues in the past: #2357 #2123
>
> Sadly we can't reproduce the problem yet. You seem to have one of the rare setups which reliably triggers the bug. Can you please help us pinpoint the issue?
>
> * A debug build could yield a detailed stacktrace.
>   With rust installed, cloning the repo and executing `cargo xtask build` should place a binary in `target/debug/zellij` which you can then execute in your homedir
>
> * Try adding a new system user with an empty home directory. Does it happen there?
>   To add a new user from the terminal, look at `useradd`
>
> * Try a different system user?

I made a new user, put the compiled binary into its home dir, and executed it. Zellij was able to load, though it was wonky and got stuck in a loop printing:

    Loading status-bar...
    Attempting to load from memory... NOT FOUND
    Attempting to load from cache... NOT FOUND
    Compiling WASM...

It appears to be the WASM compilation that is hanging. That doesn't seem related, and the app otherwise appeared to work fine and got past the panic, so yeah, that's weird. What is zellij not liking about my home directory?

To answer this question:

> looks like this might be related: rust-lang/rust#108277.
>
> Any chance you are on btrfs @RPG-Alex?

Indeed! So... from that link, some file somewhere has a timestamp that really, really angers zellij? Not sure what I can do about that, or even why that's btrfs-specific. What's next? Looking at the code base, I really feel like:

> Also, not sure if it was missed, one of the other reports does include a stacktrace: #2123 (comment) Looking at that it seems to originate around here (the code has moved recently I think) :
>
> https://github.com/zellij-org/zellij/blob/d385c73e045984e18879a34f1110d5e1fe3d46b8/zellij-server/src/plugins/plugin_loader.rs#L477-L491

This should be taking us back to: https://github.com/zellij-org/zellij/blob/d385c73e045984e18879a34f1110d5e1fe3d46b8/zellij-utils/src/consts.rs and here:

    lazy_static! {
        static ref UID: Uid = Uid::current();
        pub static ref ZELLIJ_IPC_PIPE: PathBuf = {
            let mut sock_dir = ZELLIJ_SOCK_DIR.clone();
            fs::create_dir_all(&sock_dir).unwrap();
            set_permissions(&sock_dir, 0o700).unwrap();
            sock_dir.push(envs::get_session_name().unwrap());
            sock_dir
        };
        pub static ref ZELLIJ_TMP_DIR: PathBuf = temp_dir().join(format!("zellij-{}", *UID));
        pub static ref ZELLIJ_TMP_LOG_DIR: PathBuf = ZELLIJ_TMP_DIR.join("zellij-log");
        pub static ref ZELLIJ_TMP_LOG_FILE: PathBuf = ZELLIJ_TMP_LOG_DIR.join("zellij.log");
        pub static ref ZELLIJ_SOCK_DIR: PathBuf = {
            let mut ipc_dir = envs::get_socket_dir().map_or_else(
                |_| {
                    ZELLIJ_PROJ_DIR
                        .runtime_dir()
                        .map_or_else(|| ZELLIJ_TMP_DIR.clone(), |p| p.to_owned())
                },
                PathBuf::from,
            );
            ipc_dir.push(VERSION);
            ipc_dir
        };
    }

And this specific line: pub static ref ZELLIJ_TMP_DIR: PathBuf = temp_dir().join(format!("zellij-{}", *UID));

which gets me looking at UID... which led me to nix, https://github.com/nix-rust/nix , and specifically here: https://github.com/nix-rust/nix/blob/master/src/unistd.rs

So is it possibly related to UID as well? I really don't know. If it's the timestamp that's out of range, which file is it that's out of range? Really scratching my head.
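
For what it's worth, the `ZELLIJ_TMP_DIR` line quoted above can be tried in isolation. This is a minimal sketch, not zellij's code: `zellij_tmp_log_file` is a hypothetical stand-in for the `lazy_static` values, and the UID is passed in as a plain number instead of being read through nix's `Uid::current()`.

```rust
use std::env::temp_dir;
use std::path::PathBuf;

/// Hypothetical stand-in for the `lazy_static` values above, with the
/// UID passed in instead of read through nix's `Uid::current()`.
fn zellij_tmp_log_file(uid: u32) -> PathBuf {
    temp_dir()
        .join(format!("zellij-{}", uid)) // mirrors ZELLIJ_TMP_DIR
        .join("zellij-log") // mirrors ZELLIJ_TMP_LOG_DIR
        .join("zellij.log") // mirrors ZELLIJ_TMP_LOG_FILE
}
```

With UID 1000 and a temp dir of `/tmp`, `zellij_tmp_log_file(1000)` gives `/tmp/zellij-1000/zellij-log/zellij.log`, the log path from the error message. Building the path only formats the UID into a string and joins path components; it does not read any file timestamps.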

tlinford commented 1 year ago

Hey thanks for all the info @RPG-Alex. Could you also please paste the output of stat $HOME (for the user where it crashes)? I suspect the problem is in the finalize() call, but need to see what stat says.

RPG-Alex commented 1 year ago

> Hey thanks for all the info @RPG-Alex. Could you also please paste the output of stat $HOME (for the user where it crashes)? I suspect the problem is in the finalize() call, but need to see what stat says.

Absolutely:

 File: /home/alex
  Size: 1276            Blocks: 0          IO Block: 4096   directory
Device: 0,40    Inode: 257         Links: 1
Access: (0700/drwx------)  Uid: ( 1000/    alex)   Gid: ( 1000/    alex)
Access: 2023-04-23 11:51:33.698055345 +0800
Modify: 2023-04-23 11:51:31.311359960 +0800
Change: 2023-04-23 11:51:31.311359960 +0800
 Birth: 8028897137911751214.1953853287
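
The `Birth` value can be checked by hand against the assertion from the panic message. A minimal sketch, assuming the number after the dot in `stat`'s `Birth` field is the nanosecond component; `nsec_in_range` is a hypothetical helper that only mirrors the assertion text and is not the actual std code:

```rust
/// Nanoseconds per second, the bound named in the panic message.
const NSEC_PER_SEC: u64 = 1_000_000_000;

/// Hypothetical helper mirroring the assertion
/// `tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC as i64`.
fn nsec_in_range(tv_nsec: i64) -> bool {
    tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC as i64
}
```

`nsec_in_range(1_953_853_287)` (the `Birth` value above) is `false`, while `nsec_in_range(311_359_960)` (the `Modify` value) is `true`.
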
tlinford commented 1 year ago

Thanks - it does look indeed to be https://github.com/rust-lang/rust/issues/108277.

To confirm it 100%, could you try running something like this?

fn main() {
    dbg!(std::fs::metadata("/home/alex"));
}
RPG-Alex commented 1 year ago

> Thanks - it does look indeed to be https://github.com/rust-lang/rust/issues/108277.
>
> To confirm it 100%, could you try running something like this?
>
>     fn main() {
>         dbg!(std::fs::metadata("/home/alex"));
>     }

Done and done, you seem to be correct:

[test.rs:2] std::fs::metadata("/home/alex") = Ok(
    Metadata {
        file_type: FileType(
            FileType {
                mode: 16832,
            },
        ),
        is_dir: true,
        is_file: false,
        permissions: Permissions(
            FilePermissions {
                mode: 16832,
            },
        ),
        modified: Ok(
            SystemTime {
                tv_sec: 1682516209,
                tv_nsec: 801940970,
            },
        ),
        accessed: Ok(
            SystemTime {
                tv_sec: 1682516208,
                tv_nsec: 981972677,
            },
        ),
thread 'main' panicked at 'assertion failed: tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC as i64', library/std/src/sys/unix/time.rs:69:9
stack backtrace:
   0: rust_begin_unwind
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/panicking.rs:64:14
   2: core::panicking::panic
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/panicking.rs:114:5
   3: std::sys::unix::time::Timespec::new
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/sys/unix/time.rs:69:9
   4: std::sys::unix::time::SystemTime::new
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/sys/unix/time.rs:32:25
   5: std::sys::unix::fs::FileAttr::created
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/sys/unix/fs.rs:509:24
   6: std::fs::Metadata::created
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/fs.rs:1313:9
   7: <std::fs::Metadata as core::fmt::Debug>::fmt
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/fs.rs:1327:32
   8: <&T as core::fmt::Debug>::fmt
   9: core::fmt::builders::DebugTuple::field::{{closure}}
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/fmt/builders.rs:317:17
  10: core::result::Result<T,E>::and_then
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/result.rs:1371:22
  11: core::fmt::builders::DebugTuple::field
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/fmt/builders.rs:309:23
  12: core::fmt::Formatter::debug_tuple_field1_finish
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/fmt/mod.rs:2144:9
  13: <core::result::Result<T,E> as core::fmt::Debug>::fmt
  14: <&T as core::fmt::Debug>::fmt
  15: core::fmt::run
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/fmt/mod.rs:1261:5
  16: core::fmt::write
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/fmt/mod.rs:1229:26
  17: std::io::Write::write_fmt
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/io/mod.rs:1682:15
  18: <&std::io::stdio::Stderr as std::io::Write>::write_fmt
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/io/stdio.rs:934:9
  19: <std::io::stdio::Stderr as std::io::Write>::write_fmt
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/io/stdio.rs:908:9
  20: std::io::stdio::print_to
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/io/stdio.rs:1007:21
  21: std::io::stdio::_eprint
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/io/stdio.rs:1085:5
  22: test::main
  23: core::ops::function::FnOnce::call_once
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

So does this mean it's the whole directory itself causing this? I'm still not sure how to proceed.

tlinford commented 1 year ago

Thanks for confirming it! For some reason the creation time on your home dir is invalid, and that std::fs::metadata call panics on this.

I'm not so sure if anything can be done to "fix" the bad metadata, the only thing that comes to mind really is moving all files to a new home directory.

Maybe the simplest thing is just to start from another directory until we find a way to handle this?

@imsnif what do you think about this? should we try and work around it with catch_unwind? Also only mapping the cwd when necessary (strider i think), seems like a good idea?

RPG-Alex commented 1 year ago

OK, I will close this, as that is indeed a workaround. Thanks for the help.

imsnif commented 1 year ago

> @imsnif what do you think about this? should we try and work around it with catch_unwind? Also only mapping the cwd when necessary (strider i think), seems like a good idea?

I think the only thing we can do is indeed make the home folder scanning an explicit request by plugins who need it, and then instead of crashing the whole app we will only crash the plugin (with catch_unwind).

Hopefully this will be fixed in rustc to return an error and then we can handle it much more gracefully.
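
A minimal sketch of the `catch_unwind` idea, assuming panics unwind rather than abort. `probe_birth_time` and `BirthTime` are hypothetical names for illustration, not zellij code, and this only shows the containment mechanism, not how plugins would be wired up:

```rust
use std::fs;
use std::io;
use std::panic::{self, AssertUnwindSafe};
use std::path::Path;
use std::time::SystemTime;

/// Outcome of asking for a path's creation time.
#[derive(Debug)]
enum BirthTime {
    /// The timestamp was read successfully.
    Valid(SystemTime),
    /// The call returned an error (for example, unsupported on this filesystem).
    Unavailable(io::Error),
    /// The call panicked, as in the backtrace in this thread.
    Panicked,
}

/// Hypothetical helper: read the creation time without letting a panic
/// inside `Metadata::created` take down the caller.
fn probe_birth_time(path: &Path) -> io::Result<BirthTime> {
    let meta = fs::metadata(path)?;
    // `catch_unwind` only helps when panics unwind (the default);
    // it does nothing under `panic = "abort"`.
    let outcome = panic::catch_unwind(AssertUnwindSafe(|| meta.created()));
    Ok(match outcome {
        Ok(Ok(time)) => BirthTime::Valid(time),
        Ok(Err(err)) => BirthTime::Unavailable(err),
        Err(_) => BirthTime::Panicked,
    })
}
```

The default panic hook still prints the panic message to stderr when the `Panicked` case is hit; only the unwinding is stopped at the `catch_unwind` boundary.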

raphCode commented 1 year ago

It seems that merely accessing the file metadata is not a problem, but converting the invalid timestamp to std::sys::unix::time::Timespec causes the panic? At least that is how I read the minimal repro, which prints the debug representation fine halfway and then crashes. Can we somehow avoid accessing creation dates, so that no conversion to a Timespec happens?
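
A rough sketch of what that could look like, assuming only the creation time is affected (as in the repro output, where `modified` and `accessed` printed fine). `SafeMetadata` and `read_without_birth_time` are hypothetical names, not existing zellij code:

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Hypothetical summary holding only fields that printed fine
/// in the minimal repro; the creation time is deliberately left out.
#[derive(Debug)]
struct SafeMetadata {
    is_dir: bool,
    len: u64,
    modified: Option<SystemTime>,
}

fn read_without_birth_time(path: &Path) -> io::Result<SafeMetadata> {
    let meta = fs::metadata(path)?;
    Ok(SafeMetadata {
        is_dir: meta.is_dir(),
        len: meta.len(),
        // `modified()` can fail on some platforms, so keep it optional.
        modified: meta.modified().ok(),
        // Neither `meta.created()` nor `{:?}` on `meta` is used here:
        // the backtrace shows the Debug impl of `Metadata` calling `created()`.
    })
}
```

Printing the `SafeMetadata` value with `{:?}` is fine, since its derived `Debug` impl only touches the three copied fields.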


Also, should we close the other issue too about the same problem? (#2357) edit: In the other issue, alerque advocated for keeping at least one tracking issue open, which makes sense I think.