teonite / casper-node

Reference client for CASPER protocol
https://casper.network
Apache License 2.0
0 stars 0 forks source link

Correct Handling of varuint1 Encoding in Wasm Parser #19

Closed przemyslaw closed 3 weeks ago

przemyslaw commented 4 weeks ago

A contract developer has triggered a bug in the in-house parity WebAssembly (Wasm) parser during a deployment, causing the parser to fail with the error: “Wasm preprocessing error: Deserialization error: Invalid table reference (128).” The issue arises from an incorrect handling of the call_indirect opcode in the WebAssembly (Wasm) parser, specifically related to the misinterpretation of a varuint1 field.

The current parser incorrectly parses a field in the call_indirect opcode as an unsigned 8-bit integer (u8) instead of the correct varuint1 format, which is LEB128 encoded. This leads to errors when the field value exceeds the expected range, as seen with a specific deployment where the value was 128 (0x80 in hexadecimal).

Info from Michał: "It seems there’s a contract developer that somehow managed to trigger a bug in the in-house parity wasm parser with this deploy: https://devnet.make.services/deploys/194988146d2e9beea5a0ad16245dbb15fa34d2ad5837f27192cbb280c38761a3

This wasm triggers parser error “Wasm preprocessing error: Deserialization error: Invalid table reference (128)”

I investigated it with Rafal from CL and it seems the parser in casper-wasm is buggy. According to the spec call_indirect opcode is encoded with a varuint32 field and a varuint1 field (ref: https://github.com/WebAssembly/design/blob/main/Semantics.md#calls). Now, it seems stefan.wasm the value in the call_indirect opcode inside wasm is 128 (in decimal); and the wasm parser parses this incorrectly as u8.

From what I understand varuint1 is leb128 encoded and 128 in dec is 0x80. And 0x80 decoded in leb128 form gioves 0. For whatever reasons all the wasms had literal 0 (all bits 0) encoded, but this one is 128 (0 encoded in leb128 form)

This is the location in the parser where this reserved field is parsed as u8, not varuint1: https://github.com/casper-network/casper-wasm/blob/a38e184578a06210652df6f6d26f6a56303611f7/src/elements/ops.rs#L1088-L1096. There’s Varuint1 type in the code, but I think it’s incorrect - it parses incoming byte as u8, but doesn’t parse it as leb128"

wojcik91 commented 3 weeks ago

This behavior is related to changes in rust toolchain. It was introduced in the nightly channel between 2024-08-01 and 2024-08-02: https://github.com/rust-lang/rust/compare/f8060d282...28a58f2fa

There doesn't seem to be any WASM-specific code there, so the most likely culprit seems to be the LLVM update from 18 to 19.

wojcik91 commented 3 weeks ago

In the current version of the parser call_indirect instruction is represented as CallIndirect(signature, table_ref), where the first argument is a reference to a pre-defined function type and the second argument is a reference to a table containing the function to be called. During parsing it is assumed that the second argument is always equal to 0. This is probably because in the initial WebAssembly MVP a default table was used and actual implementation of this feature was meant to be done in the future. The second argument is also discarded when processing this instruction in casper-wasmi and instead a constant is used as table index.

wojcik91 commented 3 weeks ago

Using non-default table references seems to be a part of reference-types proposal: https://github.com/WebAssembly/reference-types

This proposal has also been enabled for wasm32-unknown-unknown target for some time now: https://github.com/rust-lang/rust/blob/master/src/doc/rustc/src/platform-support/wasm32-unknown-unknown.md

It's possible that it did not have an effect until LLVM was updated.

wojcik91 commented 3 weeks ago

To properly handle current variant of call_indirect we'd have to implement support for reference-types. This feature in turn depends at least in some part on existing support (at least in the parser) for bulk-memory-operations feature.

Both of those feature have been a part of official WebAssembly spec since early 2021: https://github.com/WebAssembly/spec/commit/7fa2f20a6df4cf1c114582c8cb60f5bfcdbf1be1

'reference_typesis also enabled by default inwasm32-unknown-unknown` Rust target: https://github.com/rust-lang/rust/blob/f734f3d8e73f2118fdf4bdaf35e72ace8a442531/src/doc/rustc/src/platform-support/wasm32-unknown-unknown.md#enabled-webassembly-features

At this point in time Casper team is fine with casper-wasmi targeting the MVP WebAssembly spec and asking people to compile their contracts using a pinned nightly version (2024-08-01) and it was decided that implementing those features would be too time-consuming.

If we decide to follow through with the implementation at some point in the future we'd probably have to refactor our parser to also support bulk-memory-operations in order to update the WebAssembly spec repository used for testing. Currently casper-wasm is using a version from 2019 and casper-wasmi from 2020. A way to sidestep this issue would be to use a revision from a point in time before those proposals were merged into the official spec and the tests still lived in the proposals directory.