Closed RPG-Alex closed 1 year ago
Thanks for the report! I haven't looked into the details yet, but I remember similar unresolved issues in the past: https://github.com/zellij-org/zellij/issues/2357 https://github.com/zellij-org/zellij/issues/2123
Sadly we can't reproduce the problem yet. You seem to have one of the rare setups which reliably triggers the bug. Can you please help us pinpoint the issue?
* A debug build could yield a detailed stacktrace. With rust installed, cloning the repo and executing `cargo xtask build` should place a binary in `target/debug/zellij` which you can then execute in your homedir
* Try adding a new system user with an empty home directory. Does it happen there? To add a new user from the terminal, look at `useradd`
* Try a different system user?
Apologies, hadn't realized replying would close the comment. Will reopen but I will absolutely do this later this week. Thanks for the prompt response. Happy to contribute any way I can!
looks like this might be related: https://github.com/rust-lang/rust/issues/108277.
Any chance you are on btrfs @RPG-Alex?
Also, not sure if it was missed, one of the other reports does include a stacktrace: https://github.com/zellij-org/zellij/issues/2123#issuecomment-1418926757 Looking at that it seems to originate around here (the code has moved recently I think) : https://github.com/zellij-org/zellij/blob/d385c73e045984e18879a34f1110d5e1fe3d46b8/zellij-server/src/plugins/plugin_loader.rs#L477-L491
Ok! Finally had time to sit down here.
So, I have followed your steps here:
I made a new user, threw the compiled binary into its home dir and executed it. It was able to load zellij, though it was wonky and has been stuck in a loop saying:
Loading status-bar...
Attempting to load from memory... NOT FOUND
Attempting to load from cache... NOT FOUND
Compiling WASM...
It appears to be the WASM compilation that is hanging. That doesn't seem related, and the app otherwise appeared to work fine, getting past the panic, so yeah, that's weird. What's zellij not liking about my home directory?
To answer this question:
looks like this might be related: rust-lang/rust#108277.
Any chance you are on btrfs @RPG-Alex?
Indeed! So... from that link, some file somewhere has a timestamp that really, really angers zellij? Not sure what I can do about that, or why that's btrfs-specific even. What's next? Looking at the code base, I really feel like:
Also, not sure if it was missed, one of the other reports does include a stacktrace: #2123 (comment) Looking at that it seems to originate around here (the code has moved recently I think) :
This should be taking us back to: https://github.com/zellij-org/zellij/blob/d385c73e045984e18879a34f1110d5e1fe3d46b8/zellij-utils/src/consts.rs and here:
lazy_static! {
    static ref UID: Uid = Uid::current();
    pub static ref ZELLIJ_IPC_PIPE: PathBuf = {
        let mut sock_dir = ZELLIJ_SOCK_DIR.clone();
        fs::create_dir_all(&sock_dir).unwrap();
        set_permissions(&sock_dir, 0o700).unwrap();
        sock_dir.push(envs::get_session_name().unwrap());
        sock_dir
    };
    pub static ref ZELLIJ_TMP_DIR: PathBuf = temp_dir().join(format!("zellij-{}", *UID));
    pub static ref ZELLIJ_TMP_LOG_DIR: PathBuf = ZELLIJ_TMP_DIR.join("zellij-log");
    pub static ref ZELLIJ_TMP_LOG_FILE: PathBuf = ZELLIJ_TMP_LOG_DIR.join("zellij.log");
    pub static ref ZELLIJ_SOCK_DIR: PathBuf = {
        let mut ipc_dir = envs::get_socket_dir().map_or_else(
            |_| {
                ZELLIJ_PROJ_DIR
                    .runtime_dir()
                    .map_or_else(|| ZELLIJ_TMP_DIR.clone(), |p| p.to_owned())
            },
            PathBuf::from,
        );
        ipc_dir.push(VERSION);
        ipc_dir
    };
}
And this specific line:
pub static ref ZELLIJ_TMP_DIR: PathBuf = temp_dir().join(format!("zellij-{}", *UID));
which gets me looking at UID... which led me to nix, https://github.com/nix-rust/nix , and specifically here: https://github.com/nix-rust/nix/blob/master/src/unistd.rs
So is it possibly related to UID as well? I really don't know. If it's the timestamp out of range, which file is it that's out of range? Really scratching my head.
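For what it's worth, the `ZELLIJ_TMP_DIR` line on its own only builds a path and does not read any file timestamps. A minimal sketch of that expression, assuming the uid is passed in as a plain `u32` instead of coming from nix's `Uid::current()` (the function name `zellij_tmp_dir` and the example uid are illustrative, not part of zellij):

```rust
use std::env::temp_dir;
use std::path::PathBuf;

/// Mirrors the quoted `ZELLIJ_TMP_DIR` expression. The uid is a parameter here
/// (an assumption for illustration) rather than the result of `Uid::current()`.
fn zellij_tmp_dir(uid: u32) -> PathBuf {
    // Pure path composition: no metadata or timestamps are touched.
    temp_dir().join(format!("zellij-{}", uid))
}

fn main() {
    // 1000 is just an example uid.
    let dir = zellij_tmp_dir(1000);
    println!("{}", dir.display());
}
```

If this reading is right, the UID only determines the directory name, so the out-of-range timestamp would have to come from somewhere else.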
Hey thanks for all the info @RPG-Alex. Could you also please paste the output of `stat $HOME` (for the user where it crashes)?
I suspect the problem is in the `finalize()` call, but need to see what `stat` says.
Absolutely:
File: /home/alex
Size: 1276 Blocks: 0 IO Block: 4096 directory
Device: 0,40 Inode: 257 Links: 1
Access: (0700/drwx------) Uid: ( 1000/ alex) Gid: ( 1000/ alex)
Access: 2023-04-23 11:51:33.698055345 +0800
Modify: 2023-04-23 11:51:31.311359960 +0800
Change: 2023-04-23 11:51:31.311359960 +0800
Birth: 8028897137911751214.1953853287
Thanks - it does indeed look to be https://github.com/rust-lang/rust/issues/108277.
To confirm it 100%, could you try running something like this?
fn main() {
    dbg!(std::fs::metadata("/home/alex"));
}
Done and done, you seem to be correct:
[test.rs:2] std::fs::metadata("/home/alex") = Ok(
Metadata {
file_type: FileType(
FileType {
mode: 16832,
},
),
is_dir: true,
is_file: false,
permissions: Permissions(
FilePermissions {
mode: 16832,
},
),
modified: Ok(
SystemTime {
tv_sec: 1682516209,
tv_nsec: 801940970,
},
),
accessed: Ok(
SystemTime {
tv_sec: 1682516208,
tv_nsec: 981972677,
},
),
thread 'main' panicked at 'assertion failed: tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC as i64', library/std/src/sys/unix/time.rs:69:9
stack backtrace:
0: rust_begin_unwind
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/panicking.rs:575:5
1: core::panicking::panic_fmt
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/panicking.rs:64:14
2: core::panicking::panic
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/panicking.rs:114:5
3: std::sys::unix::time::Timespec::new
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/sys/unix/time.rs:69:9
4: std::sys::unix::time::SystemTime::new
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/sys/unix/time.rs:32:25
5: std::sys::unix::fs::FileAttr::created
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/sys/unix/fs.rs:509:24
6: std::fs::Metadata::created
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/fs.rs:1313:9
7: <std::fs::Metadata as core::fmt::Debug>::fmt
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/fs.rs:1327:32
8: <&T as core::fmt::Debug>::fmt
9: core::fmt::builders::DebugTuple::field::{{closure}}
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/fmt/builders.rs:317:17
10: core::result::Result<T,E>::and_then
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/result.rs:1371:22
11: core::fmt::builders::DebugTuple::field
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/fmt/builders.rs:309:23
12: core::fmt::Formatter::debug_tuple_field1_finish
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/fmt/mod.rs:2144:9
13: <core::result::Result<T,E> as core::fmt::Debug>::fmt
14: <&T as core::fmt::Debug>::fmt
15: core::fmt::run
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/fmt/mod.rs:1261:5
16: core::fmt::write
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/fmt/mod.rs:1229:26
17: std::io::Write::write_fmt
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/io/mod.rs:1682:15
18: <&std::io::stdio::Stderr as std::io::Write>::write_fmt
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/io/stdio.rs:934:9
19: <std::io::stdio::Stderr as std::io::Write>::write_fmt
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/io/stdio.rs:908:9
20: std::io::stdio::print_to
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/io/stdio.rs:1007:21
21: std::io::stdio::_eprint
at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/io/stdio.rs:1085:5
22: test::main
23: core::ops::function::FnOnce::call_once
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
So does this mean it's the whole directory itself causing this? I'm still not sure how to proceed.
Thanks for confirming it! For some reason the creation time on your home dir is invalid, and that std::fs::metadata call panics on this.
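For reference, the `Birth` line in the `stat` output above carries a nanosecond field of 1953853287, which is more than one second's worth of nanoseconds. Here is a minimal sketch of the range check named in the panic message, applied to the values from that output. The helper name `nsec_in_range` and the `i64` constant are illustrative stand-ins, not the standard library's actual code:

```rust
/// Nanoseconds per second. A stand-in for the constant named in the panic
/// message; declared as i64 here for simplicity.
const NSEC_PER_SEC: i64 = 1_000_000_000;

/// Returns true when a nanosecond field satisfies the condition from the
/// panic message: `tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC as i64`.
fn nsec_in_range(tv_nsec: i64) -> bool {
    tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC
}

fn main() {
    // Fractional parts copied from the `stat` output in this thread.
    let modify_nsec: i64 = 311_359_960;
    let birth_nsec: i64 = 1_953_853_287;

    println!("Modify nsec in range: {}", nsec_in_range(modify_nsec));
    println!("Birth nsec in range: {}", nsec_in_range(birth_nsec));
}
```

The `Modify` value passes the check and the `Birth` value does not, which matches the backtrace failing inside `Metadata::created`.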
I'm not so sure if anything can be done to "fix" the bad metadata, the only thing that comes to mind really is moving all files to a new home directory.
Maybe the simplest thing is just to start from another directory until we find a way to handle this?
@imsnif what do you think about this? should we try and work around it with catch_unwind? Also only mapping the cwd when necessary (strider i think), seems like a good idea?
OK, will close this, as that is indeed a workaround. Thanks for the help.
I think the only thing we can do is indeed make the home folder scanning an explicit request by plugins who need it, and then instead of crashing the whole app we will only crash the plugin (with catch_unwind).
Hopefully this will be fixed in rustc to return an error and then we can handle it much more gracefully.
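A rough sketch of what the `catch_unwind` idea could look like in isolation, assuming the risky work can be expressed as a closure. The helper name `run_guarded` is hypothetical and this is not zellij's plugin code. Note that `catch_unwind` only helps when panics unwind (not with `panic = "abort"`), and the default panic hook still prints the message to stderr:

```rust
use std::panic::{self, UnwindSafe};

/// Hypothetical helper: runs `task` and converts a panic into an `Err`
/// carrying the panic message, instead of letting it take down the caller.
fn run_guarded<T, F>(task: F) -> Result<T, String>
where
    F: FnOnce() -> T + UnwindSafe,
{
    panic::catch_unwind(task).map_err(|payload| {
        // Panic payloads are usually a &str or a String.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic payload".to_string()
        }
    })
}

fn main() {
    // A task that completes normally.
    let ok = run_guarded(|| 2 + 2);
    println!("{ok:?}");

    // A task that panics, standing in for the failing metadata conversion.
    let failed: Result<(), String> = run_guarded(|| {
        panic!("assertion failed: tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC as i64")
    });
    println!("{failed:?}");
}
```

The caller gets a `Result` either way, so only the guarded task fails rather than the whole process.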
It seems that merely accessing the file metadata is not a problem, but converting the invalid timestamp to `std::sys::unix::time::Timespec` causes the panic?
At least that's how I read the minimal repro, which prints the debug representation fine halfway and then crashes.
Can we somehow not access creation dates so no parsing to a Timespec happens?
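As a sketch of that idea: read only the fields that are needed and never call `created()` or Debug-format the whole `Metadata` value (the backtrace above shows the `Debug` impl calling `created()`). The struct `SafeInfo` and function `safe_info` are made-up names for illustration. This assumes the modification time itself is valid, as it was in the repro output:

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Hypothetical summary holding only fields unrelated to the creation time.
struct SafeInfo {
    is_dir: bool,
    len: u64,
    modified: Option<SystemTime>,
}

/// Reads selected metadata fields without touching `Metadata::created`.
fn safe_info(path: &Path) -> io::Result<SafeInfo> {
    let meta = fs::metadata(path)?;
    Ok(SafeInfo {
        is_dir: meta.is_dir(),
        len: meta.len(),
        // `modified()` returns an error on platforms without mtime support.
        modified: meta.modified().ok(),
    })
}

fn main() -> io::Result<()> {
    let info = safe_info(&std::env::temp_dir())?;
    println!(
        "is_dir={} len={} modified={:?}",
        info.is_dir, info.len, info.modified
    );
    Ok(())
}
```

Whether this avoids the crash in zellij depends on which code path actually requests the creation time, which this sketch does not establish.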
Also, should we close the other issue too about the same problem? (#2357) edit: In the other issue, alerque advocated for keeping at least one tracking issue open, which makes sense I think.
Picture of Issue:
Responses to checklist:
* Delete the contents of `/tmp/zellij-1000/zellij-log`, i.e. with `cd /tmp/zellij-1000/` and `rm -fr zellij-log/` (`/tmp/` is `$TMPDIR/` on OSX) -- Done
* Run `zellij --debug` -- Done: From home: From not home: -- runs zellij successfully
* Run `stty size`, copy the result and attach it in the bug report: 48 95
Recreate your issue. From my shell (nu shell) I attempt to launch zellij. I do so by typing "zellij". Instead of launching I get the error shown above, I'll paste it here too:
Following instructions and trying backtrace:
Error occurred in server:
× Thread 'async-std/runtime' panicked.
├─▶ At library/std/src/sys/unix/time.rs:69:9
╰─▶ assertion failed: tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC as i64
help: If you are seeing this message, it means that something went wrong.
~> nu --version 04/16/2023 12:06:16 PM 0.78.0
~> wezterm --version 04/16/2023 12:06:21 PM wezterm 20230408-112425-69ae8472
community/zellij 0.36.0-1 [installed] A terminal multiplexer