emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler
Other
25.82k stars 3.31k forks source link

Multiple main function definitions can cause emscripten to pass argc of 0 and argv of NULL, which can crash some programs #21687

Open alexbarry opened 7 months ago

alexbarry commented 7 months ago

edit: the issue is when defining one main function with signature int main(void), and another with int main(char* argv[], int argc). This seems to cause emscripten to pass argc of 0 and argv of NULL, which in my case happened to cause the Lua interpreter main function crash.


If I use emscripten to build the Lua interpreter (includes a main function) to wasm as a static library, and then build my own source files containing a second main function, and link to this Lua static library, I don't get any build time error. At runtime, I get Uncaught (in promise) RuntimeError: memory access out of bounds in chrome. And, more importantly, when trying to load the page as an electron app, it crashes the renderer process, which can be quite tricky to debug.

See https://alexbarry.github.io/misc/emscripten_multiple_mains/ for a demo of the issue, and the source is available here.

It looks like clang and gcc also don't give a build error in this case, but they don't give a runtime error either: they seem to just pick a main function and run it like normal.

I think the runtime error is new, for a while I had suppressed the Lua main function by defining my own, and I only started noticing the error recently (perhaps when upgrading emscripten). My assumption is that wasm apps don't tend to use a main function as often, in my own projects I've defined a separate entry point that I call at a certain point.

I tried to reproduce this in a more minimal example, where I defined my own static library which contained only a main function. I didn't see the error in this case. I'm not sure what it is about the Lua code that causes this.

Note that crashing the renderer process of an electron app can be tricky to debug because it detaches dev tools (meaning no visible error), and I wouldn't have had any idea what was happening without knowing to add ELECTRON_ENABLE_LOGGING=1 (log errors to stdout) before calling electron . .

Here is the runtime error:

my_exec.wasm:0x8ba55 Uncaught (in promise) RuntimeError: memory access out of bounds
    at my_exec.wasm:0x8ba55
    at my_exec.wasm:0x8b617
    at my_exec.wasm:0x44872
    at my_exec.wasm:0x44d60
    at my_exec.wasm:0x451cc
    at my_exec.wasm:0x452a0
    at my_exec.wasm:0x6471d
    at invoke_vii (my_exec.js:4980:29)
    at my_exec.wasm:0x425c8
    at my_exec.wasm:0x46148
$func1350 @ my_exec.wasm:0x8ba55
$func1348 @ my_exec.wasm:0x8b617
$func510 @ my_exec.wasm:0x44872
$func513 @ my_exec.wasm:0x44d60
$func515 @ my_exec.wasm:0x451cc
$func516 @ my_exec.wasm:0x452a0
$func850 @ my_exec.wasm:0x6471d
invoke_vii @ my_exec.js:4980
$func496 @ my_exec.wasm:0x425c8
$func527 @ my_exec.wasm:0x46148
$func849 @ my_exec.wasm:0x64487
$__main_argc_argv @ my_exec.wasm:0x8b3fe
(anonymous) @ my_exec.js:691
callMain @ my_exec.js:5295
doRun @ my_exec.js:5345
run @ my_exec.js:5360
runCaller @ my_exec.js:5280
removeRunDependency @ my_exec.js:629
receiveInstance @ my_exec.js:825
receiveInstantiationResult @ my_exec.js:843

Template

Please include the following in your bug report:

Version of emscripten/emsdk:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.55-git (165133b1cc977f0b9a277e42ef809b823157189c)
clang version 19.0.0git (/startdir/llvm-project 6c7805d5d186a6d1263f90b8033ad85e2d2633d7)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /opt/emscripten-llvm/bin

Failing command line in full: The command isn't failing. Link to build script, also attached: build_raw_lua.sh.txt

Full link command and output with -v appended:

Link to build output uploaded to github, also attached (output.log).

I tried embedding the raw text in the bug report, but I hit the limit of 65536 characters.

sbc100 commented 7 months ago

Since the lua main function is in liblua.a I would not expect and error (since the second main function should not be loaded out of the library file).

I'm currious about that is causing the crash. Is it choosing the wrong main function perhaps? Can you trying building with -g or with --profiling-funcs so you get a useful backtrace?

Does your main function and the main function in the lua library have the same signatures? i.e. to they both either take 2 args or zero args? We do have some very compilacated handling of the two different main flavors so that could be related.

alexbarry commented 7 months ago

Good catch, the two main functions did have separate signatures. When I changed mine to int main(int argc, char **argv) to match lua (instead of int main(void)), I didn't see the error anymore.

Below is what I see with -g (thanks for the pointer) and the different signatures that I had before (I also updated my demo site)

my_exec.wasm:0xa6dae Uncaught (in promise) RuntimeError: memory access out of bounds
    at my_exec.wasm.collectargs (my_exec.wasm:0xa6dae)
    at my_exec.wasm.pmain (my_exec.wasm:0xa67e4)
    at my_exec.wasm.precallC (my_exec.wasm:0x50cb5)
    at my_exec.wasm.luaD_precall (my_exec.wasm:0x51339)
    at my_exec.wasm.ccall (my_exec.wasm:0x51823)
    at my_exec.wasm.luaD_callnoyield (my_exec.wasm:0x51907)
    at my_exec.wasm.f_call (my_exec.wasm:0x77d72)
    at invoke_vii (my_exec.js:4980:29)
    at my_exec.wasm.luaD_rawrunprotected (my_exec.wasm:0x4e261)
    at my_exec.wasm.luaD_pcall (my_exec.wasm:0x52b1a)

I'll see if I can reproduce this without using Lua, instead defining my own small library with a different-signature main function defined.

alexbarry commented 7 months ago

I haven't been able to reproduce it with a more minimal example (defining my own library from scratch, rather than using Lua).

Digging into the Lua main function a bit more (now that I can see the stacktrace), it looks like the multiple mains part might be a red herring. I think Lua is simply traversing the argv params passed to main. I'll dig in a bit more.

alexbarry commented 7 months ago

Ah, sorry... it looks like the error is much more boring. Lua assumes that argv will not be null in this line: https://github.com/lua/lua/blob/v5.4/lua.c#L288

/*
** Traverses all arguments from 'argv', returning a mask with those
** needed before running any Lua code or an error code if it finds any
** invalid argument. In case of error, 'first' is the index of the bad
** argument.  Otherwise, 'first' is -1 if there is no program name,
** 0 if there is no script name, or the index of the script name.
*/
static int collectargs (char **argv, int *first) {
  int args = 0;
  int i;
  if (argv[0] != NULL) {  /* is there a program name? */

It works as expected if I change that to argv != NULL && argv[0] != NULL.

I'm not sure if there is a spec that says that argv should never be null or not.

alexbarry commented 7 months ago

According to this: https://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html

The file name of the program being run is also included in the vector as the first element; the value of argc counts this element. A null pointer always follows the last element: argv[argc] is this null pointer.

So to make people's lives a bit easier, emscripten could pass a program name as the first argv, or at least make argv[0] == NULL. And ideally emscripten could check for duplicate main definitions, there are some cases where clang (for non wasm builds)/gcc catch it but emcc does not.

But overall this has become more of a "feature request" than a "bug", I think. Thanks for your help.

sbc100 commented 7 months ago

Emscripten should never pass a null argv or even a null argv[0]. argv[0] should always be the program name.

I think the problem must stem from the fact that you have two both versions of main in the same program. Wasm struggles with functions like main that can have more than one signature so we have the handle it in special ways. I think your configuration must have confused either the linker or emscripten.

I think the simplest solution is simple to remove the main.c file from your liblua.a.. I'm fairly certain its not supposed to be in there anyway. Did you build it yourself?

alexbarry commented 7 months ago

Emscripten should never pass a null argv or even a null argv[0]. argv[0] should always be the program name.

I can confirm that in my latest minimal example, where I define int main(void) in my code while linking to an int main(int argc, char **argv) in liblua.a, the Lua main is passed argv equal to 0. But if I remove my main function, it is passed an argc of 1 and argv[0] is "./this.program", and argv[1] is NULL. I updated my demo and the source is available here.

I think the simplest solution is simple to remove the main.c file from your liblua.a.. I'm fairly certain its not supposed to be in there anyway.

Agreed, after looking at the Lua source more closely, lua.c is for the standalone interpreter, which my project doesn't need. I'll filter it out entirely. (Earlier I had looked for a macro or something to remove the main function only, I didn't realize I could remove the file entirely, since its name sounded important.)

I only raised this bug in the interest of saving other people from having to investigate something similar in the future.

Did you build it yourself?

I did, is there a better way to get wasm libraries? In my project I simply clone the lua repo, and use CMake to list its source files (and filter some out) and build them into a static library. I do something similar with zlib and libzip, and have done it for sqlite in the past.

sbc100 commented 7 months ago

Did you build it yourself?

I did, is there a better way to get wasm libraries? In my project I simply clone the lua repo, and use CMake to list its source files (and filter some out) and build them into a static library. I do something similar with zlib and libzip, and have done it for sqlite in the past.

That sounds pretty reasonable, but it leaves you vulnerable to this kind of issue (e.g. compiling sources files into a library that are not meant for the library). A better way might be to use lua's own build system to build the emscripten version (e.g. cmake or make?).

For zlib you don't need to build your own as we ship that with emscripten (use -sUSE_ZLIB). I

alexbarry commented 6 months ago

Oops, sorry I completely forgot about this.

Regarding the original issue: I'll update the title to reflect that the issue is that argv[0] is NULL and argc is 0 if you define one main of signature int main(void) and another that is int main(char* argv[], int argc). This happens to crash a specific version of Lua's main function (I think 5.4 and above). I can provide a minimal example of this if it helps.

Agreed about using the repo's build system, that would be ideal.

RE -sUSE_ZLIB, I was (in some cases) getting errors in CMake about it not being able to find zlib, despite setting this. I'm not actually sure how to tell CMake that emscripten includes its own zlib. This is likely just something that I could resolve if I read some more CMake docs, but it was easier in my case to just build zlib and set some CMake variables to tell CMake where the library and headers are.