wasmi-labs / wasmi

WebAssembly (Wasm) interpreter.
https://wasmi-labs.github.io/wasmi/
Apache License 2.0

Implement backtraces in `wasmi` #538

Open Robbepop opened 2 years ago

Robbepop commented 2 years ago

Even though wasmi is an interpreter VM for Wasm, we are missing proper debugging functionality such as backtraces. This is a heavily requested feature among smart contract users, so that they can finally properly debug and analyze their smart contract executions.

Having backtraces is not sufficient on its own, since we also need to show proper function and local variable names for a nice debugging UX. For this purpose we also need to support the Wasm name section, which provides a Wasm VM with those names for a given Wasm blob.
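As a rough illustration of what name section support involves, here is a minimal sketch that extracts function names from the payload of the custom section called "name" (the bytes after the section's own name string). The layout follows the Wasm spec appendix: a sequence of subsections, each with a one-byte id and a LEB128 size, where id 1 holds the function names. The function names `read_u32` and `function_names` are made up for this sketch and are not Wasmi APIs.

```rust
/// Reads an unsigned LEB128 `u32` from `bytes` at `*pos` and advances `*pos`.
fn read_u32(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut result: u32 = 0;
    let mut shift = 0;
    loop {
        if shift >= 32 {
            return None; // more than 5 bytes: not a valid u32
        }
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        result |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// Returns `(function index, name)` pairs from a name section payload.
/// Returns `None` on malformed input and an empty list if the
/// function-names subsection (id 1) is absent.
pub fn function_names(payload: &[u8]) -> Option<Vec<(u32, String)>> {
    let mut pos = 0;
    while pos < payload.len() {
        let id = *payload.get(pos)?;
        pos += 1;
        let size = read_u32(payload, &mut pos)? as usize;
        let end = pos.checked_add(size)?;
        let body = payload.get(pos..end)?;
        if id == 1 {
            let mut p = 0;
            let count = read_u32(body, &mut p)?;
            let mut names = Vec::new();
            for _ in 0..count {
                let idx = read_u32(body, &mut p)?;
                let len = read_u32(body, &mut p)? as usize;
                let name_end = p.checked_add(len)?;
                let raw = body.get(p..name_end)?;
                names.push((idx, core::str::from_utf8(raw).ok()?.to_string()));
                p = name_end;
            }
            return Some(names);
        }
        pos = end; // skip subsections we do not care about
    }
    Some(Vec::new())
}
```

Local variable names live in subsection id 2 and could be read the same way, with one extra level of nesting per function.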

Furthermore, any backtrace implementation for wasmi should ideally not cost any performance when backtraces are not required. Therefore we need to find a design that has zero cost for non-debug executions.
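One possible shape for such a design, sketched under the assumption that the executor can be made generic over a tracing hook: non-debug runs use a zero-sized tracer whose methods are empty, so the compiler has the chance to remove the calls entirely after monomorphization. All names here (`CallTracer`, `NoTrace`, `StackTrace`, `run_until_trap`) are hypothetical and not part of Wasmi.

```rust
/// Hypothetical hook that the interpreter loop calls on function entry and exit.
pub trait CallTracer {
    fn enter(&mut self, func_idx: u32);
    fn exit(&mut self);
}

/// Tracer for non-debug runs: zero-sized, with empty inlined methods.
pub struct NoTrace;

impl CallTracer for NoTrace {
    #[inline(always)]
    fn enter(&mut self, _func_idx: u32) {}
    #[inline(always)]
    fn exit(&mut self) {}
}

/// Tracer for debug runs: records the function indices of all active calls.
#[derive(Default)]
pub struct StackTrace {
    pub frames: Vec<u32>,
}

impl CallTracer for StackTrace {
    fn enter(&mut self, func_idx: u32) {
        self.frames.push(func_idx);
    }
    fn exit(&mut self) {
        self.frames.pop();
    }
}

/// Stand-in for the interpreter: "calls" each function in `call_chain`
/// and then traps in the innermost one.
pub fn run_until_trap<T: CallTracer>(tracer: &mut T, call_chain: &[u32]) {
    for &func_idx in call_chain {
        tracer.enter(func_idx);
    }
    // The trap happens here, so nothing is popped and the tracer
    // still holds the full call stack for the backtrace.
}
```

Whether the hook really compiles down to nothing would have to be confirmed with benchmarks; the sketch only shows the structure.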

@cmichi

ToDos

Related Work

Robbepop commented 2 years ago

Past discussions on wasmi execution debugging: https://github.com/paritytech/wasmi/issues/28

orsinium commented 3 months ago

It might be a better idea to generate a coredump on trap. Then the debug information doesn't need to be attached to the wasm binary and parsed by wasmi. That's one of the debug modes that is supported by wasmtime and wasmer.

Would you be interested in me sponsoring your work on this issue? I'm making Firefly Zero, a wasmi-powered handheld game console, and tracebacks are the most requested feature from gamedevs so far. With the current implementation, if a trap occurs, I can only provide information about the guest function that failed and the last called host function.

Robbepop commented 3 months ago

Hi @orsinium, very cool to hear about Firefly Zero and that it uses Wasmi. :) I absolutely love the mascot and the idea.

I personally have not yet looked into the technical aspects of coredumps or what Wasmi would need in order to produce them when encountering a trap, but I will take a look, as I think it is a very valuable addition to Wasmi.

What do you mean by sponsoring exactly? Implementing it yourself and sponsoring a PR? (great if it meets the quality) Or sponsoring this feature monetarily for me to implement?

orsinium commented 3 months ago

> Implementing it yourself and sponsoring a PR? (great if it meets the quality) Or sponsoring this feature monetarily for me to implement?

The latter. I looked around to see if there is a link to financially sponsor you or wasmi and haven't found any. I want to support the most important project that powers Firefly Zero, especially if it can bring closer some of the important features like the one in question.

I could also look into implementing coredumps for wasmi myself but I'll have the capacity only in October.

orsinium commented 3 months ago

> I personally have not yet taken a look into the technical aspects of coredumps and what is going to be necessary for Wasmi to support producing them

It shouldn't be that hard! This repo contains some Rust libraries and tools:

https://github.com/xtuc/wasm-coredump

The most important one for us is wasm-coredump-builder:

https://docs.rs/wasm-coredump-builder/0.1.22/wasm_coredump_builder/

The docs above have a code example:

let mut coredump_builder = wasm_coredump_builder::CoredumpBuilder::new()
        .executable_name("/usr/bin/true.exe");

{
    let mut thread_builder = wasm_coredump_builder::ThreadBuilder::new()
        .thread_name("main");

    let coredump_frame = wasm_coredump_builder::FrameBuilder::new()
        .codeoffset(123)
        .funcidx(6)
        .build();
    thread_builder.add_frame(coredump_frame);

    coredump_builder.add_thread(thread_builder.build());
}

let coredump = coredump_builder.serialize().unwrap();

So all you need to provide is the code offset and function index for each frame in the trace. People can then use wasmgdb from the same repo to map the coredump to DWARF debug info and produce a proper traceback.
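To make the per-frame data concrete, here is a minimal sketch of how an interpreter could collect it on a trap. It assumes the interpreter executes its own translated bytecode, so each function needs a table that maps internal instruction indices back to byte offsets in the original Wasm code. `Frame`, `OffsetMap` and `capture_frames` are hypothetical names, and returning frames innermost first is a choice made for this sketch, not something the coredump format is claimed to require.

```rust
/// One backtrace entry: the two values each coredump frame needs.
#[derive(Debug, PartialEq)]
pub struct Frame {
    pub funcidx: u32,
    pub codeoffset: u32,
}

/// Hypothetical per-function table, sorted by the first field:
/// (first internal instruction index, Wasm byte offset it was translated from).
pub struct OffsetMap {
    pub entries: Vec<(u32, u32)>,
}

impl OffsetMap {
    /// Finds the Wasm offset of the entry covering internal instruction `instr`.
    pub fn wasm_offset(&self, instr: u32) -> Option<u32> {
        // Index of the first entry that starts after `instr`.
        let after = self.entries.partition_point(|&(start, _)| start <= instr);
        after.checked_sub(1).map(|i| self.entries[i].1)
    }
}

/// Converts the call stack (outermost first, as `(funcidx, instr)` pairs)
/// into frames, innermost first. `maps` is indexed by function index.
pub fn capture_frames(call_stack: &[(u32, u32)], maps: &[OffsetMap]) -> Vec<Frame> {
    call_stack
        .iter()
        .rev()
        .map(|&(funcidx, instr)| {
            let codeoffset = maps
                .get(funcidx as usize)
                .and_then(|map| map.wasm_offset(instr))
                .unwrap_or(0); // unknown position: fall back to offset 0
            Frame { funcidx, codeoffset }
        })
        .collect()
}
```

The resulting `funcidx` and `codeoffset` values are what would be passed to a frame builder like the one in the example above.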

Robbepop commented 3 months ago

Interesting, just today I was thinking about making use of GitHub's sponsoring system, but I wasn't sure if anybody would actually use it. 🤔 I need to think about it.

Thanks for the wasm_coredump_builder link. It does look simple. Unfortunately, it does not (yet) seem to support no_std and thus cannot be used in Wasmi unless std mode is enabled. This is probably natural since it creates files, but technically a coredump could be stored differently on no_std platforms. I wonder whether there is a good, proven alternative for those platforms, or whether the solution is to simply not produce anything there, as is the case right now.

orsinium commented 3 months ago

It's not that hard to add no_std support in there! There isn't much code, and it seems std::io::Write is the only thing used that isn't in core or alloc. I've opened an issue: https://github.com/xtuc/wasm-coredump/issues/6
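For illustration, a dependency on std::io::Write can usually be replaced by a small output trait that only needs core. The sketch below is an assumption about how that could look, not the approach taken in the linked issue or PR; `ByteSink`, `SliceSink`, `OutOfSpace` and `write_header` are hypothetical names. The example payload is the standard Wasm module preamble (magic bytes plus version 1).

```rust
/// Minimal stand-in for `std::io::Write` that only depends on `core`.
pub trait ByteSink {
    type Error;
    fn write_all(&mut self, bytes: &[u8]) -> Result<(), Self::Error>;
}

/// Error returned when the fixed-size buffer is full.
#[derive(Debug, PartialEq)]
pub struct OutOfSpace;

/// Sink that writes into a caller-provided buffer, so no allocator is needed.
pub struct SliceSink<'a> {
    buf: &'a mut [u8],
    len: usize,
}

impl<'a> SliceSink<'a> {
    pub fn new(buf: &'a mut [u8]) -> Self {
        Self { buf, len: 0 }
    }

    /// The bytes written so far.
    pub fn written(&self) -> &[u8] {
        &self.buf[..self.len]
    }
}

impl ByteSink for SliceSink<'_> {
    type Error = OutOfSpace;

    fn write_all(&mut self, bytes: &[u8]) -> Result<(), OutOfSpace> {
        let end = self.len.checked_add(bytes.len()).ok_or(OutOfSpace)?;
        // `get_mut` fails if the write would run past the end of the buffer.
        let dst = self.buf.get_mut(self.len..end).ok_or(OutOfSpace)?;
        dst.copy_from_slice(bytes);
        self.len = end;
        Ok(())
    }
}

/// Example encoder step that works with any sink: the Wasm module preamble.
pub fn write_header<S: ByteSink>(sink: &mut S) -> Result<(), S::Error> {
    sink.write_all(b"\0asm")?;
    sink.write_all(&1u32.to_le_bytes())
}
```

With std enabled, the same trait could be implemented for any `std::io::Write` type, so file output keeps working while no_std targets pick their own storage.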

orsinium commented 3 months ago

https://github.com/xtuc/wasm-coredump/pull/7

orsinium commented 3 months ago

https://github.com/gimli-rs/leb128/issues/25

orsinium commented 3 months ago

I drafted a PR for leb128 as well:

https://github.com/gimli-rs/leb128/pull/26

It's not as pretty as I wish it was and there are a few small things left to implement. Let's see what the Gimli core team folks say.

orsinium commented 3 months ago

  1. Bad news: none of that is merged yet.
  2. Good news: I took a shot at implementing coredump encoding from scratch, with fewer allocations and no_std, and it's under 100 lines of code.
  3. Bad news: I use wasm_encoder so far, and it's not no_std, see https://github.com/paritytech/polkadot-sdk/issues/118. I might need to write all the needed bytes directly.

Robbepop commented 3 months ago

Hi @orsinium,

that's pretty cool news! Looking forward to seeing a PR. :) Maybe together we can hash out some plans for how to deal with the TODOs you mentioned.

I think the Bytecode Alliance is more open to no_std than they were back in 2022. So maybe this is closer to reality than it looks. They are currently trying to get no_std support in many of their tools and even Cranelift.

orsinium commented 3 months ago

I managed to drop wasm_encoder! Now the whole thing is 126 lines. I'll test it next week and then it's ready.
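To give an idea of why writing the bytes by hand is feasible, here is a minimal sketch of encoding a single Wasm custom section without any encoder crate. This is not the implementation mentioned above, only an illustration of the binary layout from the Wasm spec: section id 0, the LEB128-encoded size of the contents, then the section name as a length-prefixed string, then the payload. `push_u32` and `custom_section` are hypothetical helper names, and the section name used in the usage note is arbitrary.

```rust
/// Appends `value` to `out` as unsigned LEB128 (7 bits per byte,
/// high bit set on every byte except the last).
fn push_u32(out: &mut Vec<u8>, mut value: u32) {
    loop {
        let mut byte = (value & 0x7f) as u8;
        value >>= 7;
        if value != 0 {
            byte |= 0x80; // more bytes follow
        }
        out.push(byte);
        if value == 0 {
            return;
        }
    }
}

/// Encodes one custom section with the given name and raw payload.
/// Assumes name and payload each fit in a `u32` length.
pub fn custom_section(name: &str, payload: &[u8]) -> Vec<u8> {
    // Section contents: length-prefixed name followed by the payload.
    let mut body = Vec::new();
    push_u32(&mut body, name.len() as u32);
    body.extend_from_slice(name.as_bytes());
    body.extend_from_slice(payload);

    // Section header: id 0 marks a custom section, then the contents size.
    let mut out = vec![0x00];
    push_u32(&mut out, body.len() as u32);
    out.extend_from_slice(&body);
    out
}
```

For example, `custom_section("demo", &[1, 2, 3])` yields the id byte, the size byte `0x08`, the name length `0x04`, the four name bytes and the three payload bytes. In a no_std build, `Vec` would come from the alloc crate.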

orsinium commented 3 months ago

The code is ready! With a few quick tests, it seems like my implementation works even better than wasm-coredump-builder. I've made it no_std, zero-dependency, with far fewer allocations, and even found and fixed a few bugs along the way.

Now, how do you want to proceed? I can make a separate repo and crate, or we can keep things simple and put it right into wasmi. I don't mind either option; I'm not doing this for fame.