bytecodealliance / wasm-tools

CLI and Rust libraries for low-level manipulation of WebAssembly modules
Apache License 2.0
1.34k stars 245 forks source link

wasmparser accepts some malformed modules #1671

Open keithw opened 3 months ago

keithw commented 3 months ago

In the legacy-exceptions proposal, this module is malformed:

(func catch_all)

but wasmparser accepts it. This prevents passing some of the assert_malformed tests in the legacy-exceptions tests. (The resulting modules do fail validation.)

alexcrichton commented 3 months ago

cc @trzeciak would you be willing to take a look at this?

alexcrichton commented 3 months ago

Or, actually, I think this may be working as intended mostly:

$ cargo run -q validate --features legacy-exceptions <<< "(module (func catch_all))"
error: func 0 failed to validate

Caused by:
    0: catch_all found outside of a `try` block (at offset 0x17)

I think the problem is that these tests are asserting for "unexpected token" which are specifically about the s-expression-form of these instructions in the text parser. So I think this may be related to the text-parser implementation of the legacy exceptions proposal rather than the validator?

keithw commented 3 months ago

Yeah, I agree the problem is with the text parser -- these are supposed to be malformed (not just invalid). I think ideally the validator would not be getting run at all on the assert_malformed tests.

FWIW, it's possible this may not be specific to exceptions. This test also passes in the reference interpreter and WABT but not in current wasm-tools:

(assert_malformed (module quote "(func end)") "unexpected token")
alexcrichton commented 3 months ago

Ah yeah the exact shape of an error and which stage of the wasm-tools tooling reports an error is often somewhat different than the spec tests. There's a large method for "matching" errors as a result of this. For example in your (func end) example in wast (the text-to-binary parser) the end instruction isn't treated specially and we're not tracking depth or such so it emits a wasm function with an extra end instruction so the error is in the validator not the text parser. That's similar to (func catch_all) where catch_all is considered a normal instruction and so that generates an invalid binary rather than returning a text error.

Much of this comes about from wast probably being structured differently than the reference interpreter and wabt. Reflecting the precise error at the precise same stage of the text-to-validation pipeline probably isn't the hardest thing in the world but it largely just hasn't been done up to this point.

trzeciak commented 3 months ago

Sorry for the delay (PTO), if I understand correctly, this is nothing new related to the legacy-exceptions proposal. My experience in this lib and/or rust is too little to be able to do the mentioned refactor :/