paritytech / wasm-instrument

Instrument and transform wasm modules.
Apache License 2.0
47 stars 15 forks source link

Switch to `wasmparser` and `wasm-encoder` #3

Open athei opened 2 years ago

athei commented 2 years ago

Currently, we use parity-wasm both as a parser and encoder library for web assembly. However, this library is kind of outdated and only supports the wasm MVP properly.

The plan is to switch to the battle tested libraries wasmparser (for parsing) and wasm-encoder (for writing out the modified wasm). The main problem here is that those libraries don't have the required no_std support. So this is blocked until they have that or until an up-to-date fork exists. @Robbepop maintains such a fork here. However, it only added no_std support for wasmparser but it should be easy to support it for wasm-encoder (in the same repo) in a similar manner: https://github.com/Robbepop/wasm-tools/pull/1

Right now we re-export the parity-wasm crate so that our users can directly interact with those types. Some users (pallet-contract) even use those types implement some constraints and checks themselves on wasm modules. The same concept won't be possible because wasmparser is event driven and has no in-memory representation of a whole module like parity-wasm does.

Another hurdle is that we currently rely on the inability of parity-wasm to parse anything more than the wasm MVP to guard us against wasm modules that make use of features we don't want to support. I think this is a mistake: The parser should support everything that is standardized and this crate should represent the bottle neck of what we want to support. This is a natural place because we need to deal with all features in detail in order to implement a proper instrumentation.

This switch should shift the overall design so that we can accomplish all the checking and instrumentation steps in one pass over the module: We expose a single entry point like fn process<C: Config>(config: C). This Config will configure what is to be done in this single pass:

The points listed above are basically what pallet-contracts is doing right now by directly depending on parity-wasm directly: https://github.com/paritytech/substrate/blob/master/frame/contracts/src/wasm/prepare.rs . We want to get rid of that code and move it to wasm-instrument through the Config we described above.

Doing this all in one pass and possibly in parallel (for std builds) could yield massive speedups. Right now the pallet-contracts needs multiple passes over a module to achieve all of that.

Robbepop commented 2 years ago

For this you could use the already existing no_std fork of the wasmparser crate that can be found here on crates.io..

Furthermore you will need to do the same with their wasm-encoder crate that sits in the same wasm-tools Rust workspace by the BytecodeAlliance. If the author of this issue is unfamiliar with no_std Rust a look into the wasmparser-nostd branch for version 0.83 will sure help.

For using the wasmparser API a look into wasmi_v1's usage of wasmparser might be helpful, too. :)

coolreader18 commented 2 years ago

This is how wasm-tools (the suite that wasmparser/wasm-encoder is from) translates from the former to the latter: https://github.com/bytecodealliance/wasm-tools/blob/main/crates/wasm-mutate/src/mutators/translate.rs

Would be nice if that was extracted into a crate or something somehow