zellij-org / zellij

A terminal workspace with batteries included
https://zellij.dev
MIT License

bug: parallel downloads of the same plugin via web locations break the plugin #3479

Open dj95 opened 4 months ago

dj95 commented 4 months ago

When a plugin is loaded via http multiple times in parallel, the plugin breaks for zellij. At first glance this does not look like a common issue, but it is commonly reproduced with status bar plugins. When a user defines a layout with multiple tabs, on a clean cache zellij starts the download multiple times and corrupts the file.
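One way to avoid starting the same download several times is to share a single in-flight result per URL. The following is a minimal sketch under stated assumptions, not zellij's actual code: `DownloadOnce` and `get_or_download` are hypothetical names, the download is modelled as a plain closure returning bytes, and failed downloads are not handled.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

/// Hypothetical helper (not part of zellij): runs at most one download per URL.
#[derive(Default)]
pub struct DownloadOnce {
    // One shared slot per URL. The first caller fills it, later callers reuse it.
    slots: Mutex<HashMap<String, Arc<OnceLock<Vec<u8>>>>>,
}

impl DownloadOnce {
    /// Returns the bytes for `url`, calling `download` only if no other
    /// caller has already fetched (or is currently fetching) the same URL.
    pub fn get_or_download<F>(&self, url: &str, download: F) -> Vec<u8>
    where
        F: FnOnce() -> Vec<u8>,
    {
        // Hold the map lock only long enough to find or create the slot.
        let slot = {
            let mut slots = self.slots.lock().unwrap();
            Arc::clone(slots.entry(url.to_string()).or_default())
        };
        // `get_or_init` runs exactly one initializer; concurrent callers block
        // until it finishes and then read the stored value.
        slot.get_or_init(download).clone()
    }
}
```

With a layout like the one below, every tab would call `get_or_download` with the same URL, and only the first call would actually perform the request.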

Basic information

zellij --version: v0.40.1

stty size: 20 140

uname -av or ver(Windows): Linux 57ffe693fca0 6.5.0-35-generic #35-Ubuntu SMP PREEMPT_DYNAMIC Sat Apr 27 00:18:38 UTC 2024 aarch64 GNU/Linux

Excerpt from the zellij log:

ERROR  |zellij_server::plugins::w| 2024-07-06 17:33:31.943 [async-std/runti] [zellij-server/src/plugins/wasm_bridge.rs:1224]: Io(Custom { kind: NotFound, error: VerboseError { source: Os { code: 2, kind: NotFound, message: "No such file or directory" }, message: "could not rename `/root/.cache/zellij/11443609859275959894305566542538288992.part` to `/root/.cache/zellij/11443609859275959894305566542538288992`" } })
ERROR  |zellij_server::plugins::w| 2024-07-06 17:33:31.947 [async-std/runti] [zellij-server/src/plugins/wasm_bridge.rs:1224]: failed to start plugin 1 for client 1

Caused by:
    0: No such file or directory (os error 2): '/root/.local/share/zellij/plugins.wasm'
    1: No such file or directory (os error 2): ''
    2: failed to load plugin from disk
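
The rename error above would be consistent with several writers sharing one `.part` path: once one of them renames the file, the others no longer find it. As a hedged illustration only (the helper name `store_atomically` and the temp-file naming scheme are assumptions, and this is not necessarily how zellij handles it), each writer could use its own temporary file and rename it into place:

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};

// Process-wide counter so that every writer gets its own temporary file name.
static NEXT_TMP_ID: AtomicU64 = AtomicU64::new(0);

/// Hypothetical helper (not part of zellij): writes `bytes` to a private
/// temporary file next to `dest`, then renames it into place.
pub fn store_atomically(dest: &Path, bytes: &[u8]) -> io::Result<()> {
    // Build "<dest>.<pid>.<counter>.part" so parallel writers never share a file.
    let id = NEXT_TMP_ID.fetch_add(1, Ordering::Relaxed);
    let mut tmp_name = dest.as_os_str().to_owned();
    tmp_name.push(format!(".{}.{}.part", std::process::id(), id));
    let tmp = PathBuf::from(tmp_name);

    // Write the full payload and flush it to disk before publishing it.
    let mut file = fs::File::create(&tmp)?;
    file.write_all(bytes)?;
    file.sync_all()?;
    drop(file);

    // On POSIX systems a rename within one filesystem replaces `dest` atomically,
    // so readers see either the old file or a complete new one.
    fs::rename(&tmp, dest)
}
```

If several threads call `store_atomically` for the same destination, the last rename wins, but every candidate file is complete, so the cached plugin is never a partial download.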

Minimal reproduction

Loading the following layout will show the error line at the bottom. Removing all tabs except one, clearing the cache and loading the layout will work.

layout {
  default_tab_template {
    children
    pane size=1 borderless=true {
      // plugin location="file:./zjstatus.wasm.2" {
      plugin location="https://github.com/dj95/zjstatus/releases/latest/download/zjstatus.wasm" {
        format_left   "{mode} #[fg=#89b4fa,bold]{session}"
        format_center "{tabs}"
        format_right  "{command_git_branch} {datetime}{notifications}"
        format_space  ""
      }
    }
  }

  tab {
    pane split_direction="horizontal" {
      pane size="70%"
      pane
    }
  }
  tab {
    pane split_direction="horizontal" {
      pane size="70%"
      pane
    }
  }
  tab {
    pane split_direction="horizontal" {
      pane size="70%"
      pane
    }
  }
  tab {
    pane split_direction="horizontal" {
      pane size="70%"
      pane
    }
  }
  tab {
    pane split_direction="horizontal" {
      pane size="70%"
      pane
    }
  }
  tab {
    pane split_direction="horizontal" {
      pane size="70%"
      pane
    }
  }
  tab {
    pane split_direction="horizontal" {
      pane size="70%"
      pane
    }
  }
  tab {
    pane split_direction="horizontal" {
      pane size="70%"
      pane
    }
  }
  tab {
    pane split_direction="horizontal" {
      pane size="70%"
      pane
    }
  }
}

Other relevant information

I got several reports of plugins not loading via https, which might be caused by this issue:

https://github.com/dj95/zjstatus/issues/72
https://github.com/dj95/zjstatus/issues/73

imsnif commented 1 month ago

Hey, I did not intend to close this bug but GitHub likes to hijack workflows. So @dj95 - I think this should be fixed by https://github.com/zellij-org/zellij/pull/3664 - would you mind checking and letting me know?

dj95 commented 1 month ago

No worries.

I rebuilt zellij from the main branch and tried to start a layout with the https location, but zellij does not start anymore.

./target/release/zellij -l crash.kdl

The cache is cleared and the layout is the one from the first comment. This is the complete log that is printed:

ERROR  |zellij::sessions         | 2024-10-11 16:03:07.298 [main      ] [src/sessions.rs:121]: Failed to read session_info cache folder: ""/Users/daniel/Library/Caches/org.Zellij-Contributors.Zellij/0.41.0/session_info"": Os { code: 2, kind: NotFound, message: "No such file or directory" }
ERROR  |zellij::sessions         | 2024-10-11 16:03:07.298 [main      ] [src/sessions.rs:121]: Failed to read session_info cache folder: ""/Users/daniel/Library/Caches/org.Zellij-Contributors.Zellij/0.41.0/session_info"": Os { code: 2, kind: NotFound, message: "No such file or directory" }
INFO   |zellij_client            | 2024-10-11 16:03:07.298 [main      ] [zellij-client/src/lib.rs:182]: Starting Zellij client!
INFO   |zellij_server            | 2024-10-11 16:03:07.311 [main      ] [zellij-server/src/lib.rs:449]: Starting Zellij server!
WARN   |zellij_utils::kdl        | 2024-10-11 16:03:07.314 [main      ] [zellij-utils/src/kdl/mod.rs:695]: Converting new tab action without arguments, original action saved to .bak.kdl file
WARN   |zellij_utils::kdl        | 2024-10-11 16:03:07.314 [main      ] [zellij-utils/src/kdl/mod.rs:695]: Converting new tab action without arguments, original action saved to .bak.kdl file
ERROR  |zellij_client::stdin_ansi| 2024-10-11 16:03:07.315 [stdin_handler] [zellij-client/src/stdin_ansi_parser.rs:124]: Failed to open STDIN cache file: Os { code: 2, kind: NotFound, message: "No such file or directory" }
INFO   |zellij_server            | 2024-10-11 16:03:07.315 [main      ] [zellij-server/src/lib.rs:1378]: Compiling plugins using Cranelift
INFO   |zellij_server::plugins   | 2024-10-11 16:03:07.316 [wasm      ] [zellij-server/src/plugins/mod.rs:231]: Wasm main thread starts
WARN   |zellij_utils::ipc        | 2024-10-11 16:03:07.359 [router    ] [zellij-utils/src/ipc.rs:233]: Error in IpcReceiver.recv(): InvalidMarkerRead(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })
ERROR  |zellij_client            | 2024-10-11 16:03:07.359 [router    ] [zellij-client/src/lib.rs:414]: Received empty message from server

Maybe this weekend I can pinpoint the commit at which the download breaks.


Edit: without the http location the layout seems to work.

imsnif commented 1 month ago

Hum... I'm unfortunately not reproducing this. Just a guess: could it be that you compiled Zellij with cargo x build --release or some such? This could mean that the built-in plugins were not compiled for the latest API and might explain the crash.

Could you try with either cargo x run -- -l ./crash.kdl or with cargo x install /path/to/release; /path/to/release -l ./crash.kdl? If you still get the crash I'd be very much interested to dig into this further when you have the time.

Thanks for the super quick response!

dj95 commented 1 month ago

I compiled it with cargo xtask build -r and executed the binary afterwards. Let me try your suggestion tomorrow. Maybe it's something macOS-specific. But I never got these crashes when building zellij with cargo xtask build before, only with the brew version.

imsnif commented 1 month ago

Alright, I think I might have found the issue. I couldn't reproduce this exactly, but if I deleted the cache folder and started Zellij with this layout I sometimes got a case in which the plugin itself did not load. The difference in error behavior could be due to platform differences...

I issued a fix for this in https://github.com/zellij-org/zellij/pull/3665 and merged it to main. Could you give it another spin when you have the time?

dj95 commented 1 month ago

Unfortunately, it still crashes. Bisecting the changes points to this commit as the one that introduces the crashes: https://github.com/zellij-org/zellij/commit/ba2772e31ca062cba5c2b2b881fdb68e0093216e

Not sure where exactly it comes from. The cache just contains an empty .part file. Unfortunately, the log does not say anything.

It dies at this line: let res = client.send_async(request).await?; In case it is of interest, this is the stack trace of the crashing thread:

Stack Trace

```
Thread 20 Crashed:: isahc-agent-1 Dispatch queue: com.apple.root.default-qos
0   libdispatch.dylib        0x18b94c1cc _dispatch_apply_with_attr_f + 1224
1   libdispatch.dylib        0x18b94c3cc dispatch_apply + 96
2   CoreFoundation           0x18bd02250 __103-[CFPrefsSearchListSource synchronouslySendSystemMessage:andUserMessage:andDirectMessage:replyHandler:]_block_invoke.49 + 132
3   CoreFoundation           0x18bb86b1c CFPREFERENCES_IS_WAITING_FOR_SYSTEM_AND_USER_CFPREFSDS + 100
4   CoreFoundation           0x18bd0141c -[CFPrefsSearchListSource synchronouslySendSystemMessage:andUserMessage:andDirectMessage:replyHandler:] + 232
5   CoreFoundation           0x18bb84e38 -[CFPrefsSearchListSource alreadylocked_generationCountFromListOfSources:count:] + 232
6   CoreFoundation           0x18bb84b44 -[CFPrefsSearchListSource alreadylocked_getDictionary:] + 476
7   CoreFoundation           0x18bb8469c -[CFPrefsSearchListSource alreadylocked_copyValueForKey:] + 172
8   CoreFoundation           0x18bb845d0 -[CFPrefsSource copyValueForKey:] + 52
9   CoreFoundation           0x18bb84584 __76-[_CFXPreferences copyAppValueForKey:identifier:container:configurationURL:]_block_invoke + 32
10  CoreFoundation           0x18bb7db28 __108-[_CFXPreferences(SearchListAdditions) withSearchListForIdentifier:container:cloudConfigurationURL:perform:]_block_invoke + 376
11  CoreFoundation           0x18bd02b38 -[_CFXPreferences withSearchListForIdentifier:container:cloudConfigurationURL:perform:] + 440
12  CoreFoundation           0x18bb7d49c -[_CFXPreferences copyAppValueForKey:identifier:container:configurationURL:] + 156
13  CoreFoundation           0x18bb7d3c4 _CFPreferencesCopyAppValueWithContainerAndConfiguration + 112
14  Security                 0x18ee516a0 __SSLCreateContextWithRecordFuncs_block_invoke + 48
15  libdispatch.dylib        0x18b938658 _dispatch_client_callout + 20
16  libdispatch.dylib        0x18b939ea0 _dispatch_once_callout + 32
17  Security                 0x18ee513f0 SSLCreateContextWithRecordFuncs + 416
18  Security                 0x18ee51178 SSLCreateContext + 32
19  zellij                   0x103017458 sectransp_connect_step1 + 704
20  zellij                   0x103016da8 sectransp_connect_common + 224
21  zellij                   0x103016604 sectransp_connect_nonblocking + 48
22  zellij                   0x103008800 ssl_connect_nonblocking + 80
23  zellij                   0x10300694c ssl_cf_connect + 408
24  zellij                   0x102f8fe84 Curl_conn_cf_connect + 92
25  zellij                   0x102f9ba1c cf_setup_connect + 188
26  zellij                   0x102f8fe84 Curl_conn_cf_connect + 92
27  zellij                   0x102f94a90 cf_hc_baller_connect + 84
28  zellij                   0x102f93cc8 cf_hc_connect + 684
29  zellij                   0x102f901a8 Curl_conn_connect + 220
30  zellij                   0x102fd64a4 multi_runsingle + 1832
31  zellij                   0x102fd878c multi_socket + 796
32  zellij                   0x102fd8938 curl_multi_socket_action + 100
33  zellij                   0x102f84b14 curl::multi::Multi::action::h48eed0c3ff2b74bc + 76
34  zellij                   0x102eb08ac isahc::agent::AgentContext::poll::h14db3d70810fc2a8 + 2160
35  zellij                   0x102eaf3f8 isahc::agent::AgentContext::run::h2133858fa3c7ea18 + 252
36  zellij                   0x102eab9d0 isahc::agent::AgentBuilder::spawn::_$u7b$$u7b$closure$u7d$$u7d$::h75711808ff97c725 + 2724
37  zellij                   0x102f35e94 std::sys_common::backtrace::__rust_begin_short_backtrace::h6d3025b2016886ef + 16
38  zellij                   0x102f08634 std::thread::Builder::spawn_unchecked_::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h08cb6349442cf4da + 44
39  zellij                   0x102ec10b4 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h354ca23fd86b174d + 48
40  zellij                   0x102f2d148 std::panicking::try::do_call::haa7d87006b47389a + 84
41  zellij                   0x102f33cb4 __rust_try + 32
42  zellij                   0x102f2c73c std::panicking::try::h0193c06f29936c5d + 100
43  zellij                   0x102f084c4 std::thread::Builder::spawn_unchecked_::_$u7b$$u7b$closure$u7d$$u7d$::h94c023650557dcf3 + 432
44  zellij                   0x102ec5bd0 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hf2dd6be74a859108 + 24
45  zellij                   0x1036b908c std::sys::unix::thread::Thread::new::thread_start::h9b6324e2391e6ebb + 48
46  libsystem_pthread.dylib  0x18baeb2e4 _pthread_start + 136
47  libsystem_pthread.dylib  0x18bae60fc thread_start + 8
```

Edit: This is the command I used for running zellij:

cargo x run -- -l crash.kdl --debug --config /dev/null

imsnif commented 1 month ago

Hey @dj95 - thanks for following through. I might have to use you a little bit for troubleshooting here, as I am unfortunately not reproducing. I have a suspicion that this can be solved by: https://github.com/zellij-org/zellij/issues/3036#issuecomment-2024214955 (i.e. that it's an issue with the curl version that is being vendored during the compilation process). As a first step, could you follow the relevant instructions in the comment and give it a go to see if it helps?

imsnif commented 1 month ago

Just to update those following, the issue on macOS was solved through #3668 - thanks @dj95 !