Better debugging support for runtime safety errors

aran commented 2 weeks ago

Is your feature request related to a problem? Please describe. I'm facing errors like the following, and finding it challenging to trace back to the code or object that has the safety issue.

Failed to initialize: Uncaught RuntimeError: unreachable
(anonymous )
logError
imports.wbg._wbg_error_19172d49b5d0849d
$flutter_rust_bridge::misc::web_utils::js_console_error::h4664335edd9b6c51
$flutter_rust_bridge::third_party::wasm_bindgen::worker_pool::WorkerPool: :new:: ({closure)}::hfe803290904c45e9
$<dyn core::ops:: function::FnMut<(A,)>+0utput = R as wasm_bindgen:: closure: :WasmClosure>:: describe:: invoke: :h2a0ebdf1693839bd
_wbg_adapter_36
real

rust_lib.js:581 panicked at /Users/aran/.cargo/registry/src/index.crates.io-6f17d22bba15001f/flutter_rust_bridge-2.0.0-dev.32/src/rust_auto_opaque/dart2rust.rs:17:14:
Fail to borrow object. Please ensure the object is not borrowed mutably elsewhere at the same time, which violates Rust's rules.: TryLockError(())

Stack:

Error
    at http://localhost:8000/pkg/rust_lib.js:587:21
    at logError (http://localhost:8000/pkg/rust_lib.js:243:18)
    at imports.wbg.__wbg_new_abda76e883ba8a5f (http://localhost:8000/pkg/rust_lib.js:586:66)
    at rust_lib.wasm.console_error_panic_hook::Error::new::h554cb2a05d32379c (http://localhost:8000/pkg/rust_lib_bg.wasm:wasm-function[16816]:0x5f0f27)
    at rust_lib.wasm.console_error_panic_hook::hook_impl::h87698accfb0c2024 (http://localhost:8000/pkg/rust_lib_bg.wasm:wasm-function[3295]:0x3ad8de)
    at rust_lib.wasm.console_error_panic_hook::hook::hb18f4458ce62f773 (http://localhost:8000/pkg/rust_lib_bg.wasm:wasm-function[23782]:0x666ec1)
    at rust_lib.wasm.core::ops::function::Fn::call::h2b792758c46c363c (http://localhost:8000/pkg/rust_lib_bg.wasm:wasm-function[20699]:0x63a376)
    at rust_lib.wasm.<alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call::h9a6abdb0a2af1e06 (http://localhost:8000/pkg/rust_lib_bg.wasm:wasm-function[16668]:0x5edbe6)
    at rust_lib.wasm.flutter_rust_bridge::misc::panic_backtrace::PanicBacktrace::setup::{{closure}}::hcb77c123d39213d4 (http://localhost:8000/pkg/rust_lib_bg.wasm:wasm-function[4536]:0x4149b6)
    at rust_lib.wasm.<alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call::h9a6abdb0a2af1e06 (http://localhost:8000/pkg/rust_lib_bg.wasm:wasm-function[16668]:0x5edbe6)

    Uncaught (in promise) RuntimeError: unreachable
    at rust_lib.wasm.panic_abort::__rust_start_panic::abort::h304c545e427f7c64 (rust_lib_bg.wasm:0x6916a9)
    at rust_lib.wasm.__rust_start_panic (rust_lib_bg.wasm:0x683c9c)
    at rust_lib.wasm.rust_panic (rust_lib_bg.wasm:0x2dda77)
    at rust_lib.wasm.std::panicking::rust_panic_with_hook::hff999e7138f45af6 (rust_lib_bg.wasm:0x126be0)
    at rust_lib.wasm.std::panicking::begin_panic_handler::{{closure}}::h057cbdabc2d6ab77 (rust_lib_bg.wasm:0x39d7c4)
    at rust_lib.wasm.std::sys_common::backtrace::__rust_end_short_backtrace::h620ef835227fa895 (rust_lib_bg.wasm:0x689c6f)
    at rust_lib.wasm.rust_begin_unwind (rust_lib_bg.wasm:0x4f682e)
    at rust_lib.wasm.core::panicking::panic_fmt::hf5e89e7ab20f07e3 (rust_lib_bg.wasm:0x56d7e4)
    at rust_lib.wasm.core::result::unwrap_failed::hdcfce5d5a8ef304b (rust_lib_bg.wasm:0x3bf0dd)
    at rust_lib.wasm.core::result::Result<T,E>::expect::h21c3e34f0339f62a (rust_lib_bg.wasm:0x506b70)

Describe the solution you'd like

Somehow, we need a way to increase the probability of statically catching these issues, a bit clearer guidelines on how to avoid them, and more runtime support to help find what callsites are responsible for issues. On static analysis, I'm not sure if there's a team around the dart analyzer, but maybe there's a collaboration possible around getting a linear type checker into it.

In particular there's this warning in the docs:

As expected, the MyNonEncodableType, &MyNonEncodableType, &mut MyNonEncodableType means the standard Rust ownership things - owned, borrowed, mutable borrowed. For example, in normal Rust, we cannot mutably borrow the same object twice at the same time. When doing so for RustAutoOpaque objects, you will receive a runtime error.

In short, just write normal Rust code, and you are safe. Anything that violates Rust's model or safety will be caught and provide a runtime error, instead of the dangerous undefined behavior.

The issue I think I'm running into is in writing Dart code that uses that normal Rust code, figuring out what's safe or not. I'm not 100% sure, but it seems possible to have working Dart code (that freely stores and copies some data) get turned unsafe for example by changing the Rust side of a type from a translated struct to a RustAutoOpaque.

Describe alternatives you've considered N/A

fzyzcjy commented 2 weeks ago

Firstly, I guess it would be great if we show stack traces (for both Dart and Rust) when that panic happens. Indeed for native code it works great; IIRC it is the limitation of Web that makes panic handling not very elegant.

Btw, IIRC the panic may report full Rust stack trace. Could you please try to enable full stacktrace and see whether that provides you more info?

On static analysis, I'm not sure if there's a team around the dart analyzer, but maybe there's a collaboration possible around getting a linear type checker into it.

Possibly related: https://github.com/fzyzcjy/flutter_rust_bridge/issues/1641 (cc @Desdaemon).

Possibly related for overall issue: https://github.com/fzyzcjy/flutter_rust_bridge/issues/1883 (It's on my radar but has not have enough time to think about it deeply)

Desdaemon commented 2 weeks ago

In lieu of a static solution which might take some time to develop (personally I'm also very interested in this area), being able to expand the stack trace back to the offending code should be a halfway decent solution. I don't know how to do this off the top of my head, but it might require some cooperation from flutter_rust_bridge as well.

aran commented 2 weeks ago

@fzyzcjy I'm not sure how to get more info. I saw in the docs the link to https://stackoverflow.com/questions/69593235/how-to-get-panic-information-i-e-stack-trace-with-catch-unwind, but it looks like this is implemented directly already.

I have an unchanged init.rs:

#[flutter_rust_bridge::frb(init)]
pub fn init_app() {
    // Default utilities - feel free to customize
    flutter_rust_bridge::setup_default_user_utils();
}

In main.dart, I call await RustLib.init(); very early in initialization. I enabled the "log" feature to confirm this init_app is being called.

aran commented 2 weeks ago

More broadly, I think what I maybe actually want is a supported way to avoid the runtime safety errors in the first place. Any chance there are flags I can use, or FRB APIs that I can avoid, so there are no possible runtime borrow checker issues?

fzyzcjy commented 2 weeks ago

I'm not sure how to get more info.

Firstly cc @Desdaemon - who implemented the whole web part from scratch, and IIRC he has mentioned some limitations of web panics (do they still hold today / is there any workarounds?).

fzyzcjy commented 2 weeks ago

so there are no possible runtime borrow checker issues?

If you always use &SomeType and avoid &mut SomeType, I guess this will never happen: Fail to borrow object. Please ensure the object is not borrowed mutably elsewhere at the same time, which violates Rust's rules.

aran commented 2 weeks ago

That's helpful (And also challenging, because I am keeping mutable data in Rust.) Before, I had RustOpaque<std::sync::Mutex> everywhere, so I manually called my own .lock(), and didn't have this issue somehow. This doesn't work anymore because of the trait bound std::sync::Mutex<MyData>: MoiArcValue is not satisfied.

For the static analysis, I inquired with the dart team: https://github.com/dart-lang/sdk/issues/55657.

aran commented 2 weeks ago

Two additional notes:

I noticed the backtrace feature and explicitly enabled it, but didn't see an effect.
I confirmed the connection to &mut and specifically the combination of `&mut with an async method.

Does something like this make sense? I have a Dart object dartStorage, whose contents are a Rust object rustStorage. I invoke a Dart command on it like await modifyContentsOfStorage(dartStorage.rustStorage);. At some point, this acquires the RwLock. But other awaitables might run, like await readContentsOfStorage(dartStorage.rustStorage). If FRB's lock acquisition runtimes aren't tight, and an awaitable might yield while holding a lock, I see the error.

I changed my &mut methods to frb(sync), and no longer see the borrow-related panics. But unfortunately that breaks the workaround for dart-lang/language#1910. I'll file a separate issue.

Desdaemon commented 2 weeks ago

I'm not sure how relevant this is to your situation, but FRB uses try_read and try_write which are the weakest assumptions possible about whether it is possible to acquire a lock.

While trywrite fails even if there is a single other active borrower (high contention) it's the only acquisition mode that makes sense for both async and sync functions. If you need reliable lock acquisition, you might want to use `RustOpaque<RwLock<>>` and manually acquire the lock yourself, at the cost of some ergonomics.

aran commented 2 weeks ago

Is it possible to still use RustOpaque<RwLock<_>>? I get errors about conflicting implementations when trying to use tokio::sync::RwLock.

error[E0119]: conflicting implementations of trait `MoiArcValue` for type `tokio::sync::RwLock<simple::OpaqueData>`
   --> src/frb_generated.rs:205:1
    |
204 |   flutter_rust_bridge::frb_generated_moi_arc_impl_value!(RwLock<OpaqueData>);
    |   -------------------------------------------------------------------------- first implementation here
205 | / flutter_rust_bridge::frb_generated_moi_arc_impl_value!(
206 | |     flutter_rust_bridge::for_generated::rust_async::RwLock<OpaqueData>
207 | | );
    | |_^ conflicting implementation for `tokio::sync::RwLock<simple::OpaqueData>`
    |
    = note: this error originates in the macro `flutter_rust_bridge::frb_generated_moi_arc_impl_value` (in Nightly builds, run with -Z macro-backtrace for more info)

Desdaemon commented 2 weeks ago

Oh yeah I did have this issue before, I think it's that once you use one type as an auto-opaque type you can't also use it inside RustOpaque anymore. cc @fzyzcjy

fzyzcjy commented 2 weeks ago

This doesn't work anymore because of the trait bound std::sync::Mutex: MoiArcValue is not satisfied.

IIRC try tokio's Mutex.

Does something like this make sense? I have a Dart object dartStorage, whose contents are a Rust object rustStorage. I invoke a Dart command on it like await modifyContentsOfStorage(dartStorage.rustStorage);. At some point, this acquires the RwLock. But other awaitables might run, like await readContentsOfStorage(dartStorage.rustStorage). If FRB's lock acquisition runtimes aren't tight, and an awaitable might yield while holding a lock, I see the error.

That looks pretty reasonable (and we need to find out a solution)

I think it's that once you use one type as an auto-opaque type you can't also use it inside RustOpaque anymore. cc @fzyzcjy

This looks like a bug. A workaround is to avoid such scenario (e.g. always use auto or non-auto; or add a wrapper type struct B(A); to differentiate both cases). But feel free to PR for it!

To support it, I am not sure what is the best semantics: A naive solution will introduce two exactly same named dart types, causing a lot of conflicts.

aran commented 2 weeks ago

To support it, I am not sure what is the best semantics: A naive solution will introduce two exactly same named dart types, causing a lot of conflicts.

Spitballing a little, maybe FRB could generate more than one dart package to allow these conflicts to be resolved. mymodule/synchronized/mytype.dart and mymodule/mytype.dart

fzyzcjy / flutter_rust_bridge

Better debugging support for runtime safety errors #1914