AssemblyScript / binaryen.js

A buildbot for browser & Node.js builds of Binaryen, a compiler infrastructure and toolchain library for WebAssembly.
http://github.com/WebAssembly/binaryen
Apache License 2.0
360 stars 50 forks source link

Valid module fails assertions during `mod.optimize()` #86

Closed csjh closed 1 year ago

csjh commented 1 year ago

I'm working on a compiler that uses binaryen.js to generate Wasm. A problem I've run into, however, is that running mod.optimize() on the generated module (which executes fine without optimization) runs into an assertion error. Running binaryen.readBinary(mod.emitBinary()).optimize() works fine, and it passes mod.validate(), so I'm not sure what the problem is. What could I be doing wrong that causes this?

The specific assertion failure is int(_id) == int(T::SpecificId). I can probably make a reproduction if totally necessary, but it'll take a bit.

CountBleck commented 1 year ago

Can you show the whole error?

csjh commented 1 year ago

Sure,

Aborted(Assertion failed: int(_id) == int(T::SpecificId), at: /home/runner/work/binaryen.js/binaryen.js/binaryen/src/wasm.h,781,cast)
221 |                 throw (
222 |                     (A.onAbort && A.onAbort(g),
223 |                     (g = "Aborted(" + g + ")"),
224 |                     P(g),
225 |                     (rA = !0),
226 |                     (g = new WebAssembly.RuntimeError(
                             ^
RuntimeError: Aborted(Assertion failed: int(_id) == int(T::SpecificId), at: /home/runner/work/binaryen.js/binaryen.js/binaryen/src/wasm.h,781,cast). Build with -sASSERTIONS for more info.
      at d (/Users/csjh/Projects/wc/node_modules/binaryen/index.js:226:25)
      at b (/Users/csjh/Projects/wc/node_modules/binaryen/index.js:1477:20)
      at BI (/Users/csjh/Projects/wc/node_modules/binaryen/index.js:6436:20)
      at WASM  ([wasm code])
      at WASM  ([wasm code])
      at EI (/Users/csjh/Projects/wc/node_modules/binaryen/index.js:6454:20)
      at WASM  ([wasm code])
      at WASM  ([wasm code])
CountBleck commented 1 year ago

Do you happen to have Emscripten installed on your machine already?

csjh commented 1 year ago

Yup, I have the homebrew version installed

CountBleck commented 1 year ago

Do you also have CMake, as well as either make or ninja (if so, which)?

csjh commented 1 year ago

I have all of the above, I'm assuming you're going to ask me to build a local copy w/ -sASSERTIONS?

CountBleck commented 1 year ago

Nope.

# make sure emcc and emcmake are available (maybe you need to source emsdk_env.sh)
git clone https://github.com/AssemblyScript/binaryen.js --recursive
cd binaryen.js
emcmake cmake -GNinja -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXE_LINKER_FLAGS="-sSTACK_SIZE=5242880" -B binaryen/build binaryen
ninja -C binaryen/build
npm i
npm run bundle
npm link # or use your desired package manager

Then, in your project:

npm link binaryen.js # or your desired package manager
NODE_OPTIONS="--enable-source-maps" npm run yourscript # or you can use `node --enable-source-maps` directly
csjh commented 1 year ago

Better error 🕺🏻

RuntimeError: Aborted(Assertion failed: int(_id) == int(T::SpecificId), at: /Users/csjh/Projects/wc/binaryen.js/binaryen/src/wasm.h,781,cast)
    at abort (file:///Users/csjh/Projects/binaryen.js/index.js:434:15)
    at ___assert_fail (file:///Users/csjh/Projects/binaryen.js/index.js:782:7)
    at binaryen_wasm.wasm.wasm::LocalSet* wasm::Expression::cast<wasm::LocalSet>() (wasm://wasm/binaryen_wasm.wasm-3a5c9792:wasm-function[1706]:0xe35a0)
    at binaryen_wasm.wasm.wasm::CoalesceLocals::applyIndices(std::__2::vector<unsigned int, std::__2::allocator<unsigned int>>&, wasm::Expression*) (wasm://wasm/binaryen_wasm.wasm-3a5c9792:wasm-function[36006]:0x7f6f0c)
    at invoke_viii (file:///Users/csjh/Projects/binaryen.js/index.js:5778:33)
    at binaryen_wasm.wasm.wasm::CoalesceLocals::doWalkFunction(wasm::Function*) (wasm://wasm/binaryen_wasm.wasm-3a5c9792:wasm-function[36002]:0x7f3be1)
    at binaryen_wasm.wasm.wasm::Walker<wasm::CoalesceLocals, wasm::Visitor<wasm::CoalesceLocals, void>>::walkFunctionInModule(wasm::Function*, wasm::Module*) (wasm://wasm/binaryen_wasm.wasm-3a5c9792:wasm-function[36153]:0x80500e)
    at binaryen_wasm.wasm.wasm::WalkerPass<wasm::LivenessWalker<wasm::CoalesceLocals, wasm::Visitor<wasm::CoalesceLocals, void>>>::runOnFunction(wasm::Module*, wasm::Function*) (wasm://wasm/binaryen_wasm.wasm-3a5c9792:wasm-function[36152]:0x804f6f)
    at invoke_viii (file:///Users/csjh/Projects/binaryen.js/index.js:5778:33)
    at binaryen_wasm.wasm.wasm::PassRunner::runPassOnFunction(wasm::Pass*, wasm::Function*) (wasm://wasm/binaryen_wasm.wasm-3a5c9792:wasm-function[24818]:0x603ecc)
CountBleck commented 1 year ago

Oh, thanks. I'll look into it.

CountBleck commented 1 year ago

Actually...I might want a minimal reproduction...

CountBleck commented 1 year ago

I should also note that this is likely to be a Binaryen bug, but they'll need some info to figure out what's up.

csjh commented 1 year ago

Aha! Finally figured out a probable cause. Seems to happen when I use an expression reference twice, which I'm now thinking might be taboo?

import binaryen from "binaryen";

const mod = new binaryen.Module();

+ let test;
const block = mod.block(null, [
    mod.block(null, [mod.local.set(0, mod.i32.wrap(mod.i64.const(0, 0)))]),
    mod.if(
+        (test = mod.i32.lt_s(mod.local.get(0, binaryen.i32), mod.i32.wrap(mod.i64.const(1, 0)))),
-       mod.i32.lt_s(mod.local.get(0, binaryen.i32), mod.i32.wrap(mod.i64.const(1, 0))),
        mod.block("WhileLoopContainerBlock$0", [
            mod.loop(
                "WhileLoopBody$0",
                mod.block(null, [
                    mod.block(
                        "Block$1",
                        [
                            mod.local.set(
                                0,
                                mod.i32.add(
                                    mod.local.get(0, binaryen.i32),
                                    mod.i32.wrap(mod.i64.const(1, 0))
                                )
                            )
                        ],
                        binaryen.auto
                    ),
+                    mod.br_if("WhileLoopBody$0", test)
-                    mod.br_if("WhileLoopBody$0", mod.i32.lt_s(mod.local.get(0, binaryen.i32), mod.i32.wrap(mod.i64.const(1, 0))))
                ])
            )
        ])
    ),
    mod.block(null, [mod.local.set(1, mod.i32.wrap(mod.i64.const(0, 0)))]),
    mod.loop(
        "DoWhileLoopBody$2",
        mod.block("DoWhileLoopContainerBlock$2", [
            mod.block(
                "Block$3",
                [
                    mod.local.set(
                        1,
                        mod.i32.add(
                            mod.local.get(1, binaryen.i32),
                            mod.i32.wrap(mod.i64.const(1, 0))
                        )
                    )
                ],
                binaryen.auto
            ),
            mod.br_if("DoWhileLoopBody$2", mod.i32.wrap(mod.i64.const(0, 0)))
        ])
    ),
    mod.return(mod.i32.add(mod.local.get(1, binaryen.i32), mod.local.get(0, binaryen.i32)))
]);

mod.addFunction("main", binaryen.createType([]), binaryen.i32, [binaryen.i32, binaryen.i32], block);

mod.addFunctionExport("main", "main");

mod.optimize();

Doesn't work with one loop isolated, so I'm assuming it has something to do with an optimization pass. Anyways, now I'm realizing this might be something I should be using mod.copyExpression for? Probably my fault, but still feels like an easy footgun.

CountBleck commented 1 year ago

Seems to happen when I use an expression reference twice, which I'm now thinking might be taboo?

Oh, that's extremely taboo.

csjh commented 1 year ago

Is there a way to validate a module in that regard? Surprised that mod.validate() or mod.emitBinary() wouldn't whine about it.

CountBleck commented 1 year ago

It should whine about it; see here. You are running mod.validate() before mod.optimize(), right?

csjh commented 1 year ago
import binaryen from "binaryen";

const mod = new binaryen.Module();

const increment = mod.local.set(0, mod.i32.add(mod.local.get(0, binaryen.i32), mod.i32.const(1)));
const block = mod.block(null, [
    increment,
    increment,
    mod.return(mod.local.get(0, binaryen.i32))
]);

mod.addFunction("main", binaryen.createType([]), binaryen.i32, [binaryen.i32], block);

mod.addFunctionExport("main", "main");

console.assert(mod.validate() !== 0, 'validation failed');

const binary = mod.emitBinary();
const module = new WebAssembly.Module(binary);
const instance = new WebAssembly.Instance(module);
// @ts-ignore
console.log(instance.exports.main());

This version hangs if I add a mod.optimize() but the validation doesn't catch anything

CountBleck commented 1 year ago

If it's a bug, it's Binaryen's, so you'll have to report it there.

dcodeIO commented 1 year ago

Using an expression twice is indeed the issue here. The optimizer expects unique expressions as it will modify these during optimizations. Otherwise, modifying one would also modify the other, likely producing something invalid - not just invalid in the WAT sense, but perhaps even invalid in memory, since just the allocation of expressions might be reused to make them something different.

CountBleck commented 1 year ago

Yeah, but he's now asking about mod.validate() not complaining (and it seems like it's supposed to).

dcodeIO commented 1 year ago

Afaik there are no checks for reused expressions, but producers and API users are expected to adhere to the convention. Might also be expensive to perform such a check.

CountBleck commented 1 year ago

See here. It seems like it's supposed to check that?

dcodeIO commented 1 year ago

Interesting, wasn't aware of this check. I don't remember the check ever triggering for me, though.