hotg-ai / rune

Rune provides containers to encapsulate and deploy edgeML pipelines and applications
Apache License 2.0

Memory access out of bounds while running the Gesture Rune #42

Closed Michael-F-Bryan closed 3 years ago

Michael-F-Bryan commented 3 years ago

It looks like the gesture rune triggers a memory out-of-bounds error inside _call().

When I ran it in GDB, I saw that wasmer caught a segfault in the middle of a bunch of WebAssembly code. The backtrace was entirely useless because we're in the middle of some JIT-compiled code that doesn't have debug info and we can't see any stack frames above where we enter the WebAssembly.

This might be a bug in wasmer.

Alternatively, I could have messed up the way we stash the pipeline in a global variable (the static PIPELINE: Option<Box<dyn FnMut()>>) so we try to interpret garbage memory as a function pointer.
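
To make the second hypothesis concrete, here is a minimal sketch of how a pipeline closure might be stashed in a global and invoked later. It is not the code Rune generates: only the `PIPELINE` name and its type come from the comment above, while `manifest()` and `call()` are hypothetical stand-ins for the Rune's exported entry points. The sketch takes an explicit `&mut` through a raw pointer and reports a missing pipeline instead of assuming one was stored.

```rust
use std::ptr::addr_of_mut;

/// Global slot for the pipeline, mirroring the type quoted in the issue.
static mut PIPELINE: Option<Box<dyn FnMut()>> = None;

/// Hypothetical setup entry point: stores the pipeline closure.
/// Assumed to run before any `call()`.
pub fn manifest(pipeline: Box<dyn FnMut()>) {
    // SAFETY: sound only while nothing else accesses PIPELINE at the same
    // time, which this sketch assumes (single-threaded WebAssembly guest).
    let slot = unsafe { &mut *addr_of_mut!(PIPELINE) };
    *slot = Some(pipeline);
}

/// Hypothetical per-invocation entry point: runs the stored pipeline.
/// Returns `false` when no pipeline has been stored yet.
pub fn call() -> bool {
    // SAFETY: same single-threaded assumption as in `manifest()`.
    let slot = unsafe { &mut *addr_of_mut!(PIPELINE) };
    match slot.as_mut() {
        Some(pipeline) => {
            pipeline();
            true
        }
        None => false,
    }
}
```

If a layout like this is sound, the boxed closure stays a valid fat pointer between calls, so a trap inside `_call()` would point away from the global and toward the runtime or the code the pipeline itself runs.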

Steps to Reproduce

```console
$ cd hotg-ai/rune && git checkout ae5b65778bd8913b14f418ab8ba398a10899c2e6
$ cargo rune build examples/gesture/Runefile
    Finished release [optimized] target(s) in 0.87s
     Running `target/release/rune build examples/gesture/Runefile`
[2021-03-03T06:52:00.677Z DEBUG rune::build] Parsing "examples/gesture/Runefile"
warning: Unknown type
   ┌─ examples/gesture/Runefile:11:20
   │
11 │ PROC_BLOCK label hotg-ai/rune#proc_blocks/ohv_label --labels=Wing,Ring,Slope,Unknown
   │ ^^^^

[2021-03-03T06:52:00.678Z DEBUG rune::build] Compiling gesture in "/home/michael/.cache/runes/gesture"
[2021-03-03T06:52:00.681Z DEBUG rune_codegen] Executing "cargo" "build" "--target=wasm32-unknown-unknown" "--quiet" "--release"
[2021-03-03T06:52:01.779Z DEBUG rune::build] Generated 62500 bytes
$ RUST_BACKTRACE=1 cargo rune run examples/gesture/gesture.rune
    Finished release [optimized] target(s) in 0.05s
     Running `target/release/rune run examples/gesture/gesture.rune`
[2021-03-03T06:52:34.792Z INFO rune::run] Running rune: examples/gesture/gesture.rune
[2021-03-03T06:52:34.792Z DEBUG rune_runtime::runtime] Compiling the WebAssembly to native code
[2021-03-03T06:52:34.800Z DEBUG rune_runtime::runtime] Instantiating the WebAssembly module
[2021-03-03T06:52:34.801Z DEBUG rune_runtime::context] Requested capability RAND with ID 1
[2021-03-03T06:52:34.801Z DEBUG rune_runtime::context] Setting n=Integer(384) on capability 1
[2021-03-03T06:52:34.801Z DEBUG rune_runtime::context] Loaded model 2 with inputs [TensorInfo { name: "conv2d_input", element_kind: kTfLiteFloat32, dims: [1, 128, 3, 1] }] and outputs [TensorInfo { name: "Identity", element_kind: kTfLiteFloat32, dims: [1, 4] }]
[2021-03-03T06:52:34.801Z DEBUG rune_runtime::runtime] Loaded the Rune
[2021-03-03T06:52:34.801Z INFO rune::run] Call 0
[2021-03-03T06:52:34.801Z DEBUG rune_runtime::runtime] Running the rune
Error: Call failed

Caused by:
    0: Unable to call the _call function
    1: Error when calling invoke: A `memory out-of-bounds access` trap was thrown at code offset 0

Stack backtrace:
   0: rune_runtime::runtime::runtime_error
   1: rune_runtime::runtime::Runtime::call
   2: rune::main
   3: std::sys_common::backtrace::__rust_begin_short_backtrace
   4: std::rt::lang_start::{{closure}}
   5: core::ops::function::impls:: for &F>::call_once
             at /rustc/4f20caa6258d4c74ce6b316fd347e3efe81cf557/library/core/src/ops/function.rs:259:13
      std::panicking::try::do_call
             at /rustc/4f20caa6258d4c74ce6b316fd347e3efe81cf557/library/std/src/panicking.rs:379:40
      std::panicking::try
             at /rustc/4f20caa6258d4c74ce6b316fd347e3efe81cf557/library/std/src/panicking.rs:343:19
      std::panic::catch_unwind
             at /rustc/4f20caa6258d4c74ce6b316fd347e3efe81cf557/library/std/src/panic.rs:431:14
      std::rt::lang_start_internal
             at /rustc/4f20caa6258d4c74ce6b316fd347e3efe81cf557/library/std/src/rt.rs:51:25
   6: main
   7: __libc_start_main
   8: _start
```

Michael-F-Bryan commented 3 years ago

It looks like the wasmer-runtime crate was removed from wasmer when they hit 1.0. I'll try migrating from wasmer-runtime v0.17 to wasmer v1.0 and see if that fixes the issue.

kthakore commented 3 years ago

Dang this is a blocker for us. :|

Michael-F-Bryan commented 3 years ago

@kthakore is the C++ runtime also having this issue?