rerun-io / rerun

Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.
https://rerun.io/
Apache License 2.0

SIGSEGV on startup on M1 #3847

Open max-cura opened 11 months ago

max-cura commented 11 months ago

Describe the bug

Running the example from the Rust Quick Start page segfaults on 0.9.0, 0.9.1, and 0.10.0-alpha.5. Running rerun-cli segfaults on 0.9.1, 0.10.0-alpha.5 (rerun-cli@0.9.0 failed to build), and the current main branch (commit 432d7d2). In both cases, running the binary from inside lldb makes it work normally.

To Reproduce

Run `cargo install rerun-cli@{any of the versions given above}` and then simply run `rerun`. Note that `/usr/bin/lldb rerun` runs without faulting.

The first example from https://www.rerun.io/docs/getting-started/rust also reproduces the segfault.

Expected behavior

No segfaults on startup.

Backtrace

Stack trace for `cargo install rerun-cli@0.9.1`, run with `RUST_LOG=debug rerun`:

```
[2023-10-12T19:47:14Z DEBUG re_memory::memory_limit] Setting memory limit to 24.0 GiB, which is 75% of total available memory (32.0 GiB).
[2023-10-12T19:47:14Z INFO re_sdk_comms::server] Hosting a SDK server over TCP at 0.0.0.0:9876. Connect with the Rerun logging SDK.
[2023-10-12T19:47:14Z DEBUG eframe] Using the wgpu renderer
[2023-10-12T19:47:14Z DEBUG eframe::native::run] Entering the winit event loop (run_return)…
[2023-10-12T19:47:14Z DEBUG eframe::native::file_storage] Loading app state from "/Users/bosporos/Library/Application Support/rerun/app.ron"…
[2023-10-12T19:47:14Z DEBUG eframe::epi] Failed to decode RON: 1:474: Unexpected missing field `generation` in `SerializedElement`
[2023-10-12T19:47:14Z DEBUG egui_winit::clipboard] Initializing arboard clipboard…
[2023-10-12T19:47:14Z DEBUG re_viewer::native] wgpu adapter name: "Apple M1 Max", device_type: IntegratedGpu, backend: Metal, driver: "", driver_info: ""

Rerun caught a signal: SIGSEGV
Troubleshooting Rerun: https://www.rerun.io/docs/getting-started/troubleshooting
Report bugs: https://github.com/rerun-io/rerun/issues

   2: re_crash_handler::install_signal_handler::signal_handler
   3: _OSAtomicTestAndClearBarrier
   4:
   5:
   6:
   7:
   8:
   9:
  10:
  11:
  12:
  13:
  14:
  15: eframe::native::epi_integration::EpiIntegration::update
  16: ::run_ui_and_paint
  17: eframe::native::run::run_and_return::{{closure}}
  18: as winit::platform_impl::platform::app_state::EventHandler>::handle_nonuser_event
  19: winit::platform_impl::platform::app_state::Handler::handle_nonuser_event
  20: winit::platform_impl::platform::app_state::AppState::cleared
  21: std::panicking::try
  22: winit::platform_impl::platform::observer::control_flow_end_handler
  23:
  24:
  25:
  26:
  27:
  28:
  29:
  30:
  31:
  32:
  33: winit::platform_impl::platform::event_loop::EventLoop::run_return
  34: eframe::native::run::with_event_loop
  35: eframe::native::run::wgpu_integration::run_wgpu
  36: eframe::run_native

Rerun caught a signal: SIGSEGV
Troubleshooting Rerun: https://www.rerun.io/docs/getting-started/troubleshooting
Report bugs: https://github.com/rerun-io/rerun/issues
```

Stack trace for `cargo install rerun-cli@0.10.0-alpha.5`, run with `RUST_LOG=debug rerun`:

```
[2023-10-12T19:43:58Z DEBUG re_memory::memory_limit] Setting memory limit to 24.0 GiB, which is 75% of total available memory (32.0 GiB).
[2023-10-12T19:43:58Z INFO re_sdk_comms::server] Hosting a SDK server over TCP at 0.0.0.0:9876. Connect with the Rerun logging SDK.
[2023-10-12T19:43:58Z DEBUG eframe] Using the wgpu renderer
[2023-10-12T19:43:58Z DEBUG eframe::native::run] Entering the winit event loop (run_return)…
[2023-10-12T19:43:58Z DEBUG eframe::native::file_storage] Loading app state from "/Users/bosporos/Library/Application Support/rerun/app.ron"…
[2023-10-12T19:43:58Z DEBUG eframe::epi] Failed to decode RON: 1:474: Unexpected missing field `generation` in `SerializedElement`
[2023-10-12T19:43:58Z DEBUG egui_winit::clipboard] Initializing arboard clipboard…
[2023-10-12T19:43:58Z DEBUG re_viewer::native] wgpu adapter name: "Apple M1 Max", device_type: IntegratedGpu, backend: Metal, driver: "", driver_info: ""

Rerun caught a signal: SIGSEGV
Troubleshooting Rerun: https://www.rerun.io/docs/getting-started/troubleshooting
Report bugs: https://github.com/rerun-io/rerun/issues

   2: re_crash_handler::install_signal_handler::signal_handler
   3: _OSAtomicTestAndClearBarrier
   4:
   5:
   6:
   7:
   8:
   9:
  10:
  11:
  12:
  13:
  14:
  15: eframe::native::app_icon::AppTitleIconSetter::update
  16: eframe::native::epi_integration::EpiIntegration::update
  17: ::run_ui_and_paint
  18: eframe::native::run::run_and_return::{{closure}}
  19: as winit::platform_impl::platform::app_state::EventHandler>::handle_nonuser_event
  20: winit::platform_impl::platform::app_state::Handler::handle_nonuser_event
  21: winit::platform_impl::platform::app_state::AppState::cleared
  22: std::panicking::try
  23: winit::platform_impl::platform::observer::control_flow_end_handler
  24:
  25:
  26:
  27:
  28:
  29:
  30:
  31:
  32:
  33:
  34: winit::platform_impl::platform::event_loop::EventLoop::run_return
  35: eframe::native::run::with_event_loop
  36: eframe::native::run::wgpu_integration::run_wgpu
  37: eframe::run_native

Rerun caught a signal: SIGSEGV
Troubleshooting Rerun: https://www.rerun.io/docs/getting-started/troubleshooting
Report bugs: https://github.com/rerun-io/rerun/issues

[1] 90663 segmentation fault RUST_LOG=debug rerun
```

Full output from https://github.com/rerun-io/rerun with cargo build -p rerun-cli and then RUST_LOG=debug target/debug/rerun:

```
[2023-10-12T19:36:21Z DEBUG re_memory::memory_limit] Setting memory limit to 24.0 GiB, which is 75% of total available memory (32.0 GiB).
[2023-10-12T19:36:21Z INFO re_sdk_comms::server] Hosting a SDK server over TCP at 0.0.0.0:9876. Connect with the Rerun logging SDK.
[2023-10-12T19:36:21Z DEBUG eframe] Using the wgpu renderer
[2023-10-12T19:36:21Z DEBUG eframe::native::run] Entering the winit event loop (run_return)…
[2023-10-12T19:36:21Z DEBUG eframe::native::file_storage] Loading app state from "/Users/bosporos/Library/Application Support/rerun/app.ron"…
[2023-10-12T19:36:21Z DEBUG eframe::epi] Failed to decode RON: 1:474: Unexpected missing field `generation` in `SerializedElement`
[2023-10-12T19:36:21Z DEBUG egui_winit::clipboard] Initializing arboard clipboard…
[2023-10-12T19:36:21Z DEBUG re_viewer::native] wgpu adapter name: "Apple M1 Max", device_type: IntegratedGpu, backend: Metal, driver: "", driver_info: ""

Rerun caught a signal: SIGSEGV
Troubleshooting Rerun: https://www.rerun.io/docs/getting-started/troubleshooting
Report bugs: https://github.com/rerun-io/rerun/issues

      re_crash_handler::install_signal_handler::signal_handler
             at re_crash_handler/src/lib.rs:171:25
   4: _OSAtomicTestAndClearBarrier
   5:
   6:
   7:
   8:
   9:
  10:
  11:
  12:
  13:
  14:
  15:
  16: eframe::native::app_icon::set_title_and_icon_mac
             at eframe-0.23.0/src/native/app_icon.rs:225:28
      eframe::native::app_icon::set_title_and_icon
             at eframe-0.23.0/src/native/app_icon.rs:64:12
      eframe::native::app_icon::AppTitleIconSetter::update
             at eframe-0.23.0/src/native/app_icon.rs:25:27
  17: eframe::native::epi_integration::EpiIntegration::update
             at eframe-0.23.0/src/native/epi_integration.rs:521:9
  18: ::run_ui_and_paint
             at eframe-0.23.0/src/native/run.rs:1372:17
  19: eframe::native::run::run_and_return::{{closure}}
             at eframe-0.23.0/src/native/run.rs:169:17
  20: as winit::platform_impl::platform::app_state::EventHandler>::handle_nonuser_event::{{closure}}
      winit::platform_impl::platform::app_state::EventLoopHandler::with_callback
             at winit-0.28.7/src/platform_impl/macos/app_state.rs:70:13
      as winit::platform_impl::platform::app_state::EventHandler>::handle_nonuser_event
             at winit-0.28.7/src/platform_impl/macos/app_state.rs:91:9
  21: winit::platform_impl::platform::app_state::Handler::handle_nonuser_event
             at winit-0.28.7/src/platform_impl/macos/app_state.rs:199:21
  22: winit::platform_impl::platform::app_state::AppState::cleared
             at winit-0.28.7/src/platform_impl/macos/app_state.rs:388:13
  23: winit::platform_impl::platform::observer::control_flow_end_handler::{{closure}}
             at winit-0.28.7/src/platform_impl/macos/observer.rs:79:21
      winit::platform_impl::platform::observer::control_flow_handler::{{closure}}
             at winit-0.28.7/src/platform_impl/macos/observer.rs:41:9
      std::panicking::try::do_call
             at std/src/panicking.rs:500:40
      std::panicking::try
             at std/src/panicking.rs:464:19
      std::panic::catch_unwind
             at std/src/panic.rs:142:14
      winit::platform_impl::platform::event_loop::stop_app_on_panic
             at winit-0.28.7/src/platform_impl/macos/event_loop.rs:245:11
      winit::platform_impl::platform::observer::control_flow_handler
             at winit-0.28.7/src/platform_impl/macos/observer.rs:39:5
      winit::platform_impl::platform::observer::control_flow_end_handler
             at winit-0.28.7/src/platform_impl/macos/observer.rs:74:9
  24:
  25:
  26:
  27:
  28:
  29:
  30:
  31:
  32:
  33:
  34: winit::platform_impl::platform::event_loop::EventLoop::run_return::{{closure}}
             at winit-0.28.7/src/platform_impl/macos/event_loop.rs:220:22
      objc2::rc::autorelease::autoreleasepool
             at objc2-0.3.0-beta.3.patch-leaks.3/src/rc/autorelease.rs:313:5
      winit::platform_impl::platform::event_loop::EventLoop::run_return
             at winit-0.28.7/src/platform_impl/macos/event_loop.rs:211:25
  35: as winit::platform::run_return::EventLoopExtRunReturn>::run_return
             at winit-0.28.7/src/platform/run_return.rs:51:9
      eframe::native::run::run_and_return
             at eframe-0.23.0/src/native/run.rs:147:16
      eframe::native::run::wgpu_integration::run_wgpu::{{closure}}
             at eframe-0.23.0/src/native/run.rs:1566:17
      eframe::native::run::with_event_loop::{{closure}}
             at eframe-0.23.0/src/native/run.rs:130:9
      std::thread::local::LocalKey::try_with
             at std/src/thread/local.rs:270:16
      std::thread::local::LocalKey::with
             at std/src/thread/local.rs:246:9
      eframe::native::run::with_event_loop
             at eframe-0.23.0/src/native/run.rs:124:16
  36: eframe::native::run::wgpu_integration::run_wgpu
             at eframe-0.23.0/src/native/run.rs:1563:13
      eframe::run_native
             at eframe-0.23.0/src/lib.rs:233:13

Rerun caught a signal: SIGSEGV
Troubleshooting Rerun: https://www.rerun.io/docs/getting-started/troubleshooting
Report bugs: https://github.com/rerun-io/rerun/issues

[1] 90506 segmentation fault RUST_LOG=debug target/debug/rerun
```

Desktop (please complete the following information):

Rerun version 0.9.0, 0.9.1, 0.10.0-alpha.5, main@commit 432d7d2.

nikolausWest commented 11 months ago

Thanks for the bug report @max-cura, and sorry you had that happen!

Would you mind seeing if you can run through the python quick start on your system?

max-cura commented 11 months ago

It also segfaults.

I ran `pip3 install rerun-sdk` and then `python3 -m rerun_demo`, which gave:

Rerun caught a signal: SIGSEGV
Troubleshooting Rerun: https://www.rerun.io/docs/getting-started/troubleshooting
Report bugs: https://github.com/rerun-io/rerun/issues

Rerun caught a signal: SIGSEGV
Troubleshooting Rerun: https://www.rerun.io/docs/getting-started/troubleshooting
Report bugs: https://github.com/rerun-io/rerun/issues

/opt/homebrew/lib/python3.11/site-packages/rerun_sdk/rerun_demo/__main__.py:24: DeprecationWarning: Please migrate to `rr.log(…, rr.Points2D(…))` or `rr.log(…, rr.Points3D(…))`.
  See: https://www.rerun.io/docs/reference/migration-0-9 for more details.
  rr.log_points("cube", positions=cube.positions, colors=cube.colors, radii=0.5)

EDIT

Again, when I ran inside LLDB, it functioned as normal:

/usr/bin/lldb $(which python3) -o 'r -m rerun_demo' -o 'c'

Wumpf commented 11 months ago

We tested 0.9.1 on macOS 14.0 and 13.4, but I don't think anyone else tried 12.5. Given that it crashes while setting the app icon (eframe::native::app_icon::AppTitleIconSetter::update), it's not unlikely that this is related to your older macOS version. I know it's a big ask, but would you mind checking whether the problem persists after updating your system?

nikolausWest commented 11 months ago

Hi again @max-cura, thanks for checking with Python! Before potentially upgrading your OS, perhaps you could try out a simpler eframe-based app that also uses a custom app icon, like https://crates.io/crates/puffin_viewer, just to confirm our suspicion?

max-cura commented 11 months ago

Yeah, I installed puffin_viewer and it indeed shows the same pattern (segfault on start, no segfault if started in lldb). I'll try updating my system tonight.

max-cura commented 11 months ago

Upgraded to macOS 14.0.

Tried to run the Rust rerun-cli again, got this:

```
[2023-10-14T05:12:49Z INFO  re_sdk_comms::server] Hosting a SDK server over TCP at 0.0.0.0:9876. Connect with the Rerun logging SDK.

Rerun caught a signal: SIGBUS
Troubleshooting Rerun: https://www.rerun.io/docs/getting-started/troubleshooting
Report bugs: https://github.com/rerun-io/rerun/issues

   2: re_crash_handler::install_signal_handler::signal_handler
   3: __platform_memmove
   4: <unknown>
   5: <unknown>
   6: <unknown>
   7: <unknown>
   8: <unknown>
   9: <unknown>
  10: <unknown>
  11: <unknown>
  12: <unknown>
  13: <unknown>
  14: <unknown>
  15: eframe::native::epi_integration::EpiIntegration::update
  16: <eframe::native::run::wgpu_integration::WgpuWinitApp as eframe::native::run::WinitApp>::run_ui_and_paint
  17: eframe::native::run::run_and_return::{{closure}}
  18: <winit::platform_impl::platform::app_state::EventLoopHandler<T> as winit::platform_impl::platform::app_state::EventHandler>::handle_nonuser_event
  19: winit::platform_impl::platform::app_state::Handler::handle_nonuser_event
  20: winit::platform_impl::platform::app_state::AppState::cleared
  21: std::panicking::try
  22: winit::platform_impl::platform::observer::control_flow_end_handler
  23: <unknown>
  24: <unknown>
  25: <unknown>
  26: <unknown>
  27: <unknown>
  28: <unknown>
  29: <unknown>
  30: <unknown>
  31: <unknown>
  32: <unknown>
  33: winit::platform_impl::platform::event_loop::EventLoop<T>::run_return
  34: eframe::native::run::with_event_loop
  35: eframe::native::run::wgpu_integration::run_wgpu
  36: eframe::run_native

Rerun caught a signal: SIGBUS
Troubleshooting Rerun: https://www.rerun.io/docs/getting-started/troubleshooting
Report bugs: https://github.com/rerun-io/rerun/issues

[1]    3848 bus error  rerun
```

The error persisted after reinstalling rerun-cli. Same as before, there is no error when starting from lldb.

EDIT:

puffin_viewer displayed the same symptom.

nikolausWest commented 11 months ago

Thanks for upgrading your system and trying again. This is pretty bad indeed.

emilk commented 11 months ago

The icon code is here: https://github.com/emilk/egui/blob/master/crates/eframe/src/native/app_icon.rs

Interesting that you get both a SIGSEGV and a SIGBUS.

emilk commented 10 months ago

I wonder what makes @max-cura's computer different from mine. Have you tried attaching a debugger, @max-cura?

nikolausWest commented 10 months ago

@emilk: If you read further up on the issue, max wrote that attaching the debugger removed the problem.