bytecodealliance / wasmtime

A fast and secure runtime for WebAssembly
https://wasmtime.dev/
Apache License 2.0
14.82k stars 1.23k forks source link

Spurious panic in v1.21 "Cannot start a runtime from within a runtime." #8799

Closed milesj closed 3 weeks ago

milesj commented 3 weeks ago

I manage proto which utilizes a WASM plugin system powered by extism (which uses wasmtime under the hood). I was using extism v1.3, which was using wasmtime v17, without issue.

I'm trying to upgrade extism to v1.4, which bumps wasmtime to v21, but some plugin calls fail with the following:

Error:   × Main thread panicked.
  ├─▶ at /Users/miles/.cargo/registry/src/index.crates.io-6f17d22bba15001f/
  │   wasmtime-wasi-21.0.1/src/runtime.rs:94:15
  ╰─▶ Cannot start a runtime from within a runtime. This happens because a
      function (like `block_on`) attempted to block the current thread while
      the thread is being used to drive asynchronous tasks.
         0: 0x1048097c4 - rust_begin_unwind
                      at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/
      library/std/src/panicking.rs:652
         1: 0x1048d5ac0 - core::panicking::panic_fmt::hc2b459a5bd3dce66
                      at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/
      library/core/src/panicking.rs:72
         2: 0x1029145d0 -
      tokio::runtime::context::runtime::enter_runtime::h71c67f9ce90d7adb
                      at /Users/miles/.cargo/registry/src/index.crates.io-
      6f17d22bba15001f/tokio-1.38.0/src/runtime/context/runtime.rs:68
         3: 0x1029da4dc -
      tokio::runtime::handle::Handle::block_on::h1bf0d24a96667be4
                      at /Users/miles/.cargo/registry/src/index.crates.io-
      6f17d22bba15001f/tokio-1.38.0/src/runtime/handle.rs:309
         4: 0x1029babc0 - wasmtime_wasi::runtime::in_tokio::h86e6a68257414ba9
                      at /Users/miles/.cargo/registry/src/index.crates.io-
      6f17d22bba15001f/wasmtime-wasi-21.0.1/src/runtime.rs:94
         5: 0x10287b7f4 -
      wasmtime_wasi::preview1::sync::in_tokio::hfaa36b6685db3eb4
                      at /Users/miles/.cargo/registry/src/index.crates.io-
      6f17d22bba15001f/wasmtime-wasi-21.0.1/src/preview1.rs:797
         6: 0x102871864 -
      wasmtime_wasi::preview1::sync::add_wasi_snapshot_preview1_to_linker::
      {{closure}}::hb4d9d1d03b870150
                      at /Users/miles/.cargo/registry/src/index.crates.io-
      6f17d22bba15001f/wasmtime-wasi-21.0.1/src/preview1.rs:779
         7: 0x10282d308 - <F as wasmtime::runtime::func::IntoFunc<T,
      (wasmtime::runtime::func::Caller<T>,A1,A2,A3,A4,A5),R>>::into_func::nati
      ve_call_shim::{{closure}}::h6523ad1cf21333de
                      at /Users/miles/.cargo/registry/src/index.crates.io-
      6f17d22bba15001f/wasmtime-21.0.1/src/runtime/func.rs:2216
         8: 0x10297c33c -
      core::ops::function::FnOnce::call_once::h91938a2041e64111
                      at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/
      library/core/src/ops/function.rs:250
         9: 0x102841088 - wasmtime::runtime::func::Caller<T>::with::
      {{closure}}::hf3ae2a96081e3789
                      at /Users/miles/.cargo/registry/src/index.crates.io-
      6f17d22bba15001f/wasmtime-21.0.1/src/runtime/func.rs:2000
        10: 0x1028b314c -
      wasmtime::runtime::vm::instance::Instance::from_vmctx::hadc721fcc0c8ce82
                      at /Users/miles/.cargo/registry/src/index.crates.io-
      6f17d22bba15001f/wasmtime-21.0.1/src/runtime/vm/instance.rs:240
        11: 0x10283a7bc -
      wasmtime::runtime::func::Caller<T>::with::h7a637a6f69be75cc
                      at /Users/miles/.cargo/registry/src/index.crates.io-
      6f17d22bba15001f/wasmtime-21.0.1/src/runtime/func.rs:1996
        12: 0x10282cf58 - <F as wasmtime::runtime::func::IntoFunc<T,
      (wasmtime::runtime::func::Caller<T>,A1,A2,A3,A4,A5),R>>::into_func::nati
      ve_call_shim::{{closure}}::h13c08eccf960f6be
                      at /Users/miles/.cargo/registry/src/index.crates.io-
      6f17d22bba15001f/wasmtime-21.0.1/src/runtime/func.rs:2239
        13: 0x1029fe3ac - <core::panic::unwind_safe::AssertUnwindSafe<F> as
      core::ops::function::FnOnce<()>>::call_once::h9dbf965735c9e405
                      at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/
      library/core/src/panic/unwind_safe.rs:272
        14: 0x1029c3c30 - std::panicking::try::do_call::h4b6c28649817d04f
                      at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/
      library/std/src/panicking.rs:559
        15: 0x1029e4274 - ___rust_try
        16: 0x1029c30b8 - std::panicking::try::hc904728b212a80ec
                      at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/
      library/std/src/panicking.rs:523
        17: 0x10290c0b0 - std::panic::catch_unwind::hc1ef182edf83b816
                      at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/
      library/std/src/panic.rs:149
        18: 0x10288bee4 -
      wasmtime::runtime::vm::traphandlers::catch_unwind_and_longjmp::hfc27d178
      d428aaca
                      at /Users/miles/.cargo/registry/src/index.crates.io-
      6f17d22bba15001f/wasmtime-21.0.1/src/runtime/vm/traphandlers.rs:123
        19: 0x10282ce54 - <F as wasmtime::runtime::func::IntoFunc<T,
      (wasmtime::runtime::func::Caller<T>,A1,A2,A3,A4,A5),R>>::into_func::nati
      ve_call_shim::h6d324f8ac59a7ef9
                      at /Users/miles/.cargo/registry/src/index.crates.io-
      6f17d22bba15001f/wasmtime-21.0.1/src/runtime/func.rs:2237
        20: 0x10ad6e960 - <unresolved>

This is very confusing because both extism and proto are interacting with wasmtime through sync calls, so I'm not entirely sure how we're hitting tokio/async stuff. However, proto is using tokio itself to power the CLI, but wasm calls are non-async.

I tried stepping through the panic with a debugger, but I was unable to uncover where exactly it's happening. The debugger would constantly jump between tokio worker threads and I would lose context.

Test Case

N/A, use steps below

Steps to Reproduce

Expected Results

Does not panic.

Actual Results

Consistently panics.

For context, this only happens with the proto python plugin, and not the other proto plugins. Here's the python plugin function that panics: https://github.com/moonrepo/python-plugin/blob/master/src/proto.rs#L143

Versions and Environment

Wasmtime version or commit: 21

Operating system: macos (but fails on all)

Architecture: arm64 (but fails on all)

Extra Info

PR of failing builds: https://github.com/moonrepo/proto/actions/runs/9504911118/job/26198611886?pr=515

alexcrichton commented 3 weeks ago

Thanks for the report! If you're able to reproduce locally can you attach gdb and capture a backtrace of the thread at the time of the panic?

That backtrace should be able to go through wasm code and see everything and that's the main culprit to look at there. My hunch as to what's going on is that you're executing WebAssembly code on a Tokio event loop and that's what's causing the panic. This is just a hunch though so a stack trace would be able to help diagnose this. If this is the case then the best fix would likely be to offload the execution of WebAssembly to a worker thread not on the Tokio event loop

nilslice commented 3 weeks ago

@alexcrichton - thanks for the suggestion. Do you know if this happened to be a recent change where this function was introduced?

https://docs.rs/wasmtime-wasi/21.0.1/wasmtime_wasi/runtime/fn.with_ambient_tokio_runtime.html#

If so, it may be the solution we need to enable on-event-loop WebAssembly execution.

In Extism, there's a check for some needed extra runtime initialization (see: https://github.com/extism/extism/blob/c3e912dffb97269a5f305b6258ae0992e1d6e5c4/runtime/src/plugin.rs#L630-L633)

Perhaps it was previously the case that wasmtime_wasi assumed it could be executed from within an existing tokio runtime, and now it expects to be the parent tokio context?

pchickey commented 3 weeks ago

This change landed in the 19 release https://github.com/bytecodealliance/wasmtime/blob/release-19.0.0/RELEASES.md#changed https://github.com/bytecodealliance/wasmtime/pull/7933 when the new wasmtime-wasi implementation was promoted to the default (root of crate) and the legacy wasi-common was removed as an export from wasmtime-wasi.

In the new wasmtime-wasi implementation, tokio is required to be used either externally to wasmtime when using add_to_linker_async and the instantiate_async/call_async mode of entering wasm, or else it if you use add_to_linker_sync it creates its own private tokio runtime to run that same async code in each import function call. Tokio wont let you create a new runtime on the same stack that an existing runtime is on. So, if you wish to use wasmtime-wasi through a the synchronous wasmtime interface, you'll have to do so on a new stack (via thread creation as Alex suggests) or else switch to using the async wasmtime interface.

Your other alternative is to use the wasi-common implementation, but that is approaching its EOL at this point, so we don't recommend that as a long term solution.

alexcrichton commented 3 weeks ago

I think that this issue is currently working as intended in the sense that this is a consequence of the current design of wasmtime-wasi. Given that I'm going to close this, but @nilslice if there are still difficulties/issues remaining let me know and I can reopen too.

nilslice commented 3 weeks ago

Thanks! We ended up switching back to wasi-common to fix.