Closed tims-bsquare closed 12 months ago
I was just testing wasmoon out today and hit some unexpected behavior caused by 32-bit ints, so I am excited to see this issue being worked on. Unfortunately, I'm hitting a segfault on your branch (which doesn't occur on main). I'm still working on trying to find a minimal script that reproduces this, but let me know if you have any ideas or suggestions on how to get a more useful stack trace. I have bisected it down to being caused by the LUA_INT_DEFAULT
change (and not the thread.ts change or other build.sh changes) at least.
Aborted(segmentation fault)
/home/wasmoon/dist/index.js:1663
/** @suppress {checkTypes} */ var e = new WebAssembly.RuntimeError(what);
^
RuntimeError: Aborted(segmentation fault)
at abort (/home/wasmoon/dist/index.js:1663:44)
at segfault (/home/wasmoon/dist/index.js:1474:6)
at glue.wasm.SAFE_HEAP_STORE_i32_4_4 (wasm://wasm/glue.wasm-006c350e:wasm-function[1373]:0xc2972)
at glue.wasm.emmalloc_memalign (wasm://wasm/glue.wasm-006c350e:wasm-function[1282]:0xbf8cd)
at glue.wasm.emmalloc_aligned_realloc (wasm://wasm/glue.wasm-006c350e:wasm-function[1286]:0xbfddc)
at glue.wasm.emmalloc_realloc (wasm://wasm/glue.wasm-006c350e:wasm-function[1288]:0xc0155)
at glue.wasm.l_alloc (wasm://wasm/glue.wasm-006c350e:wasm-function[185]:0x13d7d)
at glue.wasm.luaM_malloc_ (wasm://wasm/glue.wasm-006c350e:wasm-function[667]:0x55311)
at glue.wasm.luaC_newobjdt (wasm://wasm/glue.wasm-006c350e:wasm-function[490]:0x3b43a)
at glue.wasm.luaC_newobj (wasm://wasm/glue.wasm-006c350e:wasm-function[491]:0x3b5b6)
Turns out that crash is triggered by me wrapping require
in a JS function. Feel free to ignore that as it's probably not a recommended thing to do, and without this hook, everything is working perfectly for me, but here's a simplified version of what I was doing for reference (along with some reasonably-complex lua code that does a bunch of require
's):
const origRequire = lua.global.get('require');
lua.global.set('require', path => origRequire(path));
That's a funky error, I'll take a look
I've tried to write a test to reproduce it but it passes, could you try and tweak it @Sapu94 until it fails please?
it('wrap require function', async () => {
const factory = getFactory()
await factory.mountFile('test.lua', 'return 10')
const engine = await factory.createEngine()
const origRequire = engine.global.get('require');
engine.global.set('require', path => origRequire(path));
await engine.doString(`require('test')`)
})
It could also be related to this https://github.com/emscripten-core/emscripten/issues/13131
Just getting back to this after being away for the past week. It seems like this can only be reproduced by getting the Lua VM / wasm stack into some specific state. Sorry in advance for the huge can of worms this seems to be 😅. Here's as far as I've gotten in coming up with a minimal set of reproduction steps (I'm doing all this in WSL for what it's worth):
~/wasmoon
and switch to the branch for this PRnpm run build:wasm:docker:dev && npm run build
~/wasmoon-test
and install dependencies with npm install
(the wasmoon dependency is set as file:../wasmoon
so it uses the local clone)npx ts-node main.ts
Here's the latest error:
$ npx ts-node main.ts
Aborted(segmentation fault)
/home/sapu/wasmoon/dist/index.js:1663
/** @suppress {checkTypes} */ var e = new WebAssembly.RuntimeError(what);
^
RuntimeError: Aborted(segmentation fault)
at abort (/home/sapu/wasmoon/dist/index.js:1663:44)
at segfault (/home/sapu/wasmoon/dist/index.js:1474:6)
at glue.wasm.SAFE_HEAP_STORE_i32_1_1 (wasm://wasm/glue.wasm-006bc9de:wasm-function[1358]:0xc15d3)
at glue.wasm.luaV_execute (wasm://wasm/glue.wasm-006bc9de:wasm-function[1070]:0xaedc4)
at glue.wasm.ccall (wasm://wasm/glue.wasm-006bc9de:wasm-function[435]:0x33fe4)
at glue.wasm.luaD_callnoyield (wasm://wasm/glue.wasm-006bc9de:wasm-function[436]:0x340be)
at glue.wasm.f_call (wasm://wasm/glue.wasm-006bc9de:wasm-function[105]:0xa4b5)
at invoke_vii (/home/sapu/wasmoon/dist/index.js:6285:31)
at glue.wasm.luaD_rawrunprotected (wasm://wasm/glue.wasm-006bc9de:wasm-function[416]:0x2fc40)
at glue.wasm.luaD_pcall (wasm://wasm/glue.wasm-006bc9de:wasm-function[447]:0x35745)
Let me know if you're able to reproduce the issue with these steps or not.
Here are a few of the things I've tried in my test repo that make me the crash go away:
TestLuaCode
folder to A
and update the require()'s accordinglyIncludeB.lua
up one level (to TestLuaCode/Lib
) and updating the require() accordingly{enableProxy: true}
insteadrequire
as mentioned aboveI don't actually think any of the above are super meaningful, other than to point to the issue being caused by things getting into a specific (corrupted) state as opposed to just one faulty function / line of code somewhere.
That's a great repro repo, thank you for putting that together! :) I think what you encountered was an actual proper segfault where the library was trying to reference a free'd value and I suspect sometimes the memory may not yet have been corrupted depending on those small changes you listed.
I've pushed another change to this branch, unrelated to the misbehaving integers hopefully fixing this for you (it does for me locally).
Something to be aware of is that there's a potential memory leak with doFile/callByteCode that's necessary to avoid segfaults, which was and is done by moving the result value from the execution thread to the global. The code was returning a reference to the value in the thread which had been cleaned up rather than the copy in the global state. Anyway that means if you intend to call doFile regularly with results being returned then it's important to either reset your global state or pop the result from the top of the stack with lua.global.pop()
That fix works great, thanks! Good to know about the memory leak.
Great job @tims-bsquare! Can I merge it or are you still working on this PR?
No problem, should be ready to merge @ceifa :) Let me add that TODO first though