Seeker14491 / opener

Open a file or link in the system default program.
Apache License 2.0

"Reveal" feature causes the app to segfault on linux in some cases #27

Closed qu1ck closed 6 months ago

qu1ck commented 11 months ago

I am using this lib in my multiplatform app, and one of my Linux users reported that the app started segfaulting once I started using the reveal feature. Note that the app crashes on start, not when the opener code is called. The crash also happens in native (C/C++) code rather than in Rust: there is no Rust panic message or backtrace on crash. Here is the gdb backtrace:

Thread 1 "trgui-ng" received signal SIGSEGV, Segmentation fault.
___pthread_mutex_lock (mutex=0x0) at pthread_mutex_lock.c:80
80        unsigned int type = PTHREAD_MUTEX_TYPE_ELISION (mutex);               
(gdb) backtrace
#0  ___pthread_mutex_lock (mutex=0x0) at pthread_mutex_lock.c:80
#1  0x00007fffe2fd96cc in _dbus_platform_cmutex_lock (mutex=<optimized out>) at /usr/src/debug/dbus/dbus/dbus/dbus-sysdeps-pthread.c:153
#2  _dbus_lock (lock=_DBUS_LOCK_server_slots) at /usr/src/debug/dbus/dbus/dbus/dbus-threads.c:348
#3  _dbus_data_slot_allocator_alloc (slot_id_p=0x7fffeb6e2014 <server_slot>, allocator=0x7fffe2fff060 <slot_allocator.lto_priv>) at /usr/src/debug/dbus/dbus/dbus/dbus-dataslot.c:75
#4  dbus_server_allocate_data_slot (slot_p=slot_p@entry=0x7fffeb6e2014 <server_slot>) at /usr/src/debug/dbus/dbus/dbus/dbus-server.c:1100
#5  0x00007fffeb6bdb8b in atspi_dbus_server_setup_with_g_main (server=server@entry=0x5555579efae0, context=0x0) at ../at-spi2-core/atspi/atspi-gmain.c:615
#6  0x00007fffee7778ce in spi_atk_create_socket (app=0x555557674f60) at ../at-spi2-core/atk-adaptor/bridge.c:958
#7  0x00007fffee77e73e in impl_get_app_bus (bus=<optimized out>, msg=0x5555576d81a0, data=<optimized out>) at ../at-spi2-core/atk-adaptor/adaptors/application-adaptor.c:96
#8  0x00007fffee7820e5 in handle_other
    (pathstr=0x5555579eba78 "/org/a11y/atspi/accessible/root", member=0x5555579ebad8 "GetApplicationBusAddress", iface=<optimized out>, path=0x555557702c80, message=0x5555576d81a0, bus=0x55555769cb20) at ../at-spi2-core/droute/droute.c:558
#9  handle_message (bus=0x55555769cb20, message=0x5555576d81a0, user_data=0x555557702c80) at ../at-spi2-core/droute/droute.c:605
#10 0x0000555555e410b5 in _dbus_object_tree_dispatch_and_unlock (tree=0x5555576cddb0, message=0x5555576d81a0, found_object=0x7fffffff5250) at ./vendor/dbus/dbus/dbus-object-tree.c:1021
#11 0x0000555555e33485 in dbus_connection_dispatch (connection=0x55555769cb20) at ./vendor/dbus/dbus/dbus-connection.c:4742
#12 0x00007fffeb6bb68b in message_queue_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../at-spi2-core/atspi/atspi-gmain.c:89
#13 0x00007ffff050df69 in g_main_dispatch (context=0x5555575b1cc0) at ../glib/glib/gmain.c:3476
#14 0x00007ffff056c327 in g_main_context_dispatch_unlocked (context=0x5555575b1cc0) at ../glib/glib/gmain.c:4284
#15 g_main_context_iterate_unlocked.isra.0 (context=context@entry=0x5555575b1cc0, block=block@entry=0, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/glib/gmain.c:4349
#16 0x00007ffff050c162 in g_main_context_iteration (context=0x5555575b1cc0, context@entry=0x0, may_block=may_block@entry=0) at ../glib/glib/gmain.c:4414
#17 0x00007ffff2ded057 in gtk_main_iteration_do (blocking=0) at ../gtk/gtk/gtkmain.c:1457
#18 0x0000555556802a19 in gtk::auto::functions::main_iteration_do (blocking=false) at src/auto/functions.rs:405
#19 0x00005555559f5894 in tao::platform_impl::platform::event_loop::{impl#1}::run_return::{closure#0}<tauri_runtime_wry::Message<tauri::EventLoopMessage>, tauri_runtime_wry::{impl#48}::run::{closure_env#0}<tauri::EventLoopMessage, tauri::app::{impl#18}::run::{closure_env#0}<tauri_runtime_wry::Wry<tauri::EventLoopMessage>, trguing::main::{closure_env#0}>>> ()
    at /home/jerry/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tao-0.16.2/src/platform_impl/linux/event_loop.rs:1060
#20 0x0000555555d08e51 in glib::auto::main_context::MainContext::with_thread_default<i32, tao::platform_impl::platform::event_loop::{impl#1}::run_return::{closure_env#0}<tauri_runtime_wry::Message<tauri::EventLoopMessage>, tauri_runtime_wry::{impl#48}::run::{closure_env#0}<tauri::EventLoopMessage, tauri::app::{impl#18}::run::{closure_env#0}<tauri_runtime_wry::Wry<tauri::EventLoopMessage>, trguing::main::{closure_env#0}>>>> (self=0x7fffffff65c8, func=...) at /home/jerry/.cargo/registry/src/index.crates.io-6f17d22bba15001f/glib-0.15.12/src/main_context.rs:156
#21 0x00005555559f540e in tao::platform_impl::platform::event_loop::EventLoop<tauri_runtime_wry::Message<tauri::EventLoopMessage>>::run_return<tauri_runtime_wry::Message<tauri::EventLoopMessage>, tauri_runtime_wry::{impl#48}::run::{closure_env#0}<tauri::EventLoopMessage, tauri::app::{impl#18}::run::{closure_env#0}<tauri_runtime_wry::Wry<tauri::EventLoopMessage>, trguing::main::{closure_env#0}>>> (self=0x7fffffff6690, callback=...) at /home/jerry/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tao-0.16.2/src/platform_impl/linux/event_loop.rs:958
#22 0x00005555559f67cd in tao::platform_impl::platform::event_loop::EventLoop<tauri_runtime_wry::Message<tauri::EventLoopMessage>>::run<tauri_runtime_wry::Message<tauri::EventLoopMessage>, tauri_runtime_wry::{impl#48}::run::{closure_env#0}<tauri::EventLoopMessage, tauri::app::{impl#18}::run::{closure_env#0}<tauri_runtime_wry::Wry<tauri::EventLoopMessage>, trguing::main::{closure_env#0}>>> (self=<error reading variable: Cannot access memory at address 0x0>, callback=<error reading variable: Cannot access memory at address 0x0>)
    at /home/jerry/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tao-0.16.2/src/platform_impl/linux/event_loop.rs:912
#23 0x0000555555a0b3b9 in tao::event_loop::EventLoop<tauri_runtime_wry::Message<tauri::EventLoopMessage>>::run<tauri_runtime_wry::Message<tauri::EventLoopMessage>, tauri_runtime_wry::{impl#48}::run::{closure_env#0}<tauri::EventLoopMessage, tauri::app::{impl#18}::run::{closure_env#0}<tauri_runtime_wry::Wry<tauri::EventLoopMessage>, trguing::main::{closure_env#0}>>>
    (self=..., event_handler=<error reading variable: Cannot access memory at address 0x0>)
    at /home/jerry/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tao-0.16.2/src/event_loop.rs:179
#24 0x0000555555949975 in tauri_runtime_wry::{impl#48}::run<tauri::EventLoopMessage, tauri::app::{impl#18}::run::{closure_env#0}<tauri_runtime_wry::Wry<tauri::EventLoopMessage>, trguing::main::{closure_env#0}>> (self=..., callback=...) at /home/jerry/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tauri-runtime-wry-0.14.0/src/lib.rs:2255
#25 0x0000555555781a46 in tauri::app::App<tauri_runtime_wry::Wry<tauri::EventLoopMessage>>::run<tauri_runtime_wry::Wry<tauri::EventLoopMessage>, trguing::main::{closure_env#0}>
    (self=..., callback=...) at /home/jerry/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tauri-1.4.0/src/app.rs:875
#26 0x00005555558e970e in trguing::main () at src/main.rs:202
#27 0x0000555555b55b9b in core::ops::function::FnOnce::call_once<fn(), ()> () at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/ops/function.rs:250
#28 0x0000555555c78dae in std::sys_common::backtrace::__rust_begin_short_backtrace<fn(), ()> (f=0x5555558e6310 <trguing::main>)
    at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:154
#29 0x0000555555d37941 in std::rt::lang_start::{closure#0}<()> () at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/rt.rs:166
#30 0x0000555556d42e5b in core::ops::function::impls::{impl#2}::call_once<(), (dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)>
    (self=..., args=()) at library/core/src/ops/function.rs:284
#31 std::panicking::try::do_call<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (data=<optimized out>)
    at library/std/src/panicking.rs:502
#32 std::panicking::try<i32, &(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> (f=...) at library/std/src/panicking.rs:466
#33 std::panic::catch_unwind<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (f=...) at library/std/src/panic.rs:142
#34 std::rt::lang_start_internal::{closure#2} () at library/std/src/rt.rs:148
#35 std::panicking::try::do_call<std::rt::lang_start_internal::{closure_env#2}, isize> (data=<optimized out>) at library/std/src/panicking.rs:502
#36 std::panicking::try<isize, std::rt::lang_start_internal::{closure_env#2}> (f=...) at library/std/src/panicking.rs:466
#37 std::panic::catch_unwind<std::rt::lang_start_internal::{closure_env#2}, isize> (f=...) at library/std/src/panic.rs:142
#38 std::rt::lang_start_internal (main=..., argc=<optimized out>, argv=<optimized out>, sigpipe=<optimized out>) at library/std/src/rt.rs:148
#39 0x0000555555d3791a in std::rt::lang_start<()> (main=0x5555558e6310 <trguing::main>, argc=1, argv=0x7fffffffdda8, sigpipe=0)
    at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/rt.rs:165
#40 0x00005555558e9ede in main ()
#41 0x00007ffff02f9cd0 in __libc_start_call_main (main=main@entry=0x5555558e9ec0 <main>, argc=argc@entry=1, argv=argv@entry=0x7fffffffdda8) at ../sysdeps/nptl/libc_start_call_main.h:58
#42 0x00007ffff02f9d8a in __libc_start_main_impl
    (main=0x5555558e9ec0 <main>, argc=1, argv=0x7fffffffdda8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdd98) at ../csu/libc-start.c:360
#43 0x00005555556fec95 in _start ()
(gdb) 

The original bug report, https://github.com/openscopeproject/TrguiNG/issues/105, has more info.

The user has this issue on their Arch system. I cannot reproduce it on Debian.

Any clue why this may be happening? Could it be a bug in the dbus-rs package?

Seeker14491 commented 11 months ago

No idea what caused the issue. It does seem dbus related given that backtrace, but I don't have any dbus internals expertise that would give me an idea of why this crash happened. Given that the original issue was resolved by reinstalling Arch, I'm going to assume it was some issue with the Arch install, and will close this issue.

qu1ck commented 7 months ago

I believe the root cause of the crash is the statically linked dbus lib. See more here: https://internals.rust-lang.org/t/global-symbols-from-statically-linked-system-libraries/19954

Static linking causes some global variables to not be initialized correctly, which could manifest as a crash with the stack trace above.

Is there any reason why you use the vendored feature here? https://github.com/Seeker14491/opener/blob/ff6e2599f1360332dac2b87877f9e1694ad654c3/opener/Cargo.toml#L31

My understanding is that without it the lib will not be statically linked, so it may solve the above crash. Would it create other problems?
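
For readers following along, here is a toy model of the failure mode described in this comment. It is only a sketch under stated assumptions, not dbus code: it assumes the process ends up with two copies of the library (one vendored into the executable, one loaded as the system shared library), each with its own global state, and that only one copy gets initialized. The names `LibCopy`, `init_threads`, and `allocate_data_slot` are hypothetical stand-ins. In the real crash the uninitialized lock is a NULL mutex pointer (`mutex=0x0` in frame #0), which segfaults instead of returning an error.

```rust
use std::sync::Mutex;

// Hypothetical stand-in for one loaded copy of a C library that keeps its
// locks in global state and expects an init call before first use.
struct LibCopy {
    name: &'static str,
    // `None` models the NULL mutex pointer seen in frame #0 of the backtrace.
    global_lock: Option<Mutex<()>>,
}

impl LibCopy {
    fn new(name: &'static str) -> Self {
        LibCopy { name, global_lock: None }
    }

    // Models the one-time setup that creates the library's global locks.
    fn init_threads(&mut self) {
        self.global_lock = Some(Mutex::new(()));
    }

    // Models an API call that takes a global lock before doing its work.
    fn allocate_data_slot(&self) -> Result<(), String> {
        match &self.global_lock {
            Some(lock) => {
                let _guard = lock.lock().map_err(|e| e.to_string())?;
                Ok(())
            }
            None => Err(format!("{}: global lock is uninitialized", self.name)),
        }
    }
}

fn main() {
    // Two copies of the "same" library, each with separate globals.
    let mut vendored = LibCopy::new("vendored");
    let system = LibCopy::new("system");

    // Only the vendored copy is ever initialized.
    vendored.init_threads();

    // Calls that resolve to the initialized copy work as expected...
    println!("{:?}", vendored.allocate_data_slot());
    // ...while a call that resolves to the other copy finds no lock.
    println!("{:?}", system.allocate_data_slot());
}
```

This lines up with the backtrace in that frames #10 and #11 come from `./vendor/dbus` inside the executable, while frames #1 to #4 come from the system dbus build, but whether that is the exact mechanism is an assumption based on the linked thread.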

Seeker14491 commented 7 months ago

The reason for using the vendored feature is to avoid needing dbus's usual dependencies to both build opener, and run executables that use opener.

qu1ck commented 7 months ago

I don't understand how static linking helps avoid any dependencies. Doesn't the other end of what dbus talks to need dbus dependencies anyway?

Can you make it optional at least?

Seeker14491 commented 7 months ago

On the development side, the vendored feature avoids needing dbus's library dependencies for compiling the library. On the user side, the vendored feature lets them run the built binaries, even if they lack the dbus runtime library (though that's rare). In that case the dbus functionality won't work, but they can still run the program.

I can make the vendored dbus feature optional, by adding a new feature flag to opener that controls it. I think I can also keep vendoring as the default, and therefore avoid making this a breaking change.

qu1ck commented 7 months ago

> I can make the vendored dbus feature optional, by adding a new feature flag to opener that controls it. I think I can also keep vendoring as the default, and therefore avoid making this a breaking change.

That would resolve the issue for me.

Seeker14491 commented 7 months ago

Alright, I implemented this in a9dc2eead17c4a1a9adc2fa0398e04fdd6f474d5. It's technically a breaking change though if someone was using default-features = false for some reason. There's another unrelated PR I want to resolve (https://github.com/Seeker14491/opener/pull/28), then I plan on making a release.

As for the true underlying issue, if it's a Rust or Cargo bug we should open an issue in the appropriate place, if one hasn't been made.

qu1ck commented 7 months ago

I'm not well versed enough in linker intricacies to file a proper bug report with a reproducible example; I'm just going off the Nvidia engineer's analysis in the rust-lang forum thread I linked above. That thread just got closed for inactivity, sadly.

If it helps, I believe the whole investigation started from this webkitgtk bug report: https://bugs.webkit.org/show_bug.cgi?id=261874#c57 (the link goes to the Nvidia driver developer's comment explaining the crash in my app)

Seeker14491 commented 6 months ago

I've released opener v0.7.0.
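
For anyone landing here with the same crash, opting out of the vendored dbus might look like the following in a downstream `Cargo.toml`. This is a sketch, assuming the v0.7.0 release includes the change discussed above (vendoring is controlled by a default-on feature, so disabling default features turns it off) and that the `reveal` feature is still wanted. Other features your app relies on would need to be listed explicitly too.

```toml
[dependencies]
# Assumption: disabling default features drops the vendored (statically
# linked) dbus, so opener links against the system libdbus instead.
opener = { version = "0.7", default-features = false, features = ["reveal"] }
```

With vendoring off, the system dbus development files are needed at build time and the dbus shared library at run time, which is the trade-off described earlier in this thread.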

username227 commented 6 months ago

Please note that with the new version, the program only renders properly when started WITHOUT the environment variable contained in the .desktop file. If I start the latest commit with the .desktop file, it renders a white screen. Therefore, please push out a corrected .desktop file. If the AUR PKGBUILD needs updating, I can do the same thing as last time; just tell me the direct download link to use in the PKGBUILD. Thanks.

qu1ck commented 6 months ago

@username227 please open an issue for that on TrguiNG issue tracker, it is not relevant to this repo.