emscripten-core / emscripten

STANDALONE_WASM breaks Emscripten exceptions #18253

Closed clauverjat closed 1 year ago

clauverjat commented 1 year ago

Hello

It looks like adding the -s STANDALONE_WASM option breaks the JS-based Emscripten exception support (enabled with -fexceptions). With the standalone flag, throwing an exception aborts the program instead of being handled properly.

Note: I understand that Emscripten supports both native, "experimental" Wasm exceptions (via -fwasm-exceptions) and JavaScript-based exceptions (enabled via -fexceptions). At first sight it might look a bit peculiar that I combine "-s STANDALONE_WASM" with "-fexceptions" rather than with "-fwasm-exceptions". This is because I target a Wasm runtime without native Exception Handling support. So I use "-s STANDALONE_WASM" to maximize the use of WASI interfaces, but I still need the JS-based exception handling (hence the "-fexceptions" flag). I hope this makes sense 😉

Version of emscripten/emsdk:

$ emcc -v
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.25 (febd44b21ecaca86e2cb2a25ef3ed4a0a2076365)
clang version 16.0.0 (https://github.com/llvm/llvm-project effd75bda4b1a9b26e554c1cda3e3b4c72fa0aa8)
Target: wasm32-unknown-emscripten
Thread model: posix

How to reproduce the bug?

I created a simple C++ file to test exception support. Let's save it as main.cpp:

#include <iostream>
#include <stdexcept>

int main() {
    std::cout << "enter main" << std::endl;
    try {
        std::cout << "enter try" << std::endl;
        throw std::runtime_error("");
        std::cout << "illegal" << std::endl;
    } catch (std::runtime_error const& ex) {
        std::cout << "catching runtime_error" << std::endl;
    }
    std::cout << "exit main" << std::endl;
}

If I compile the code with em++ -sWASM_BIGINT -fexceptions -O1 src/main.cpp -o main.html, I get the following when I run it (which is the expected output):

enter main
enter try
catching runtime_error
exit main

But if I add the -s STANDALONE_WASM flag (em++ -s STANDALONE_WASM -sWASM_BIGINT -fexceptions -O1 src/main.cpp -o main.html), the output changes and I only get:

enter main
enter try

Full link command and output with -v appended:

em++ -v -sWASM_BIGINT -fexceptions -O1 src/main.cpp -o main.html

 "/workspaces/my_project/emsdk/upstream/bin/clang" --version
 "/workspaces/my_project/emsdk/upstream/bin/clang++" -target wasm32-unknown-emscripten -fvisibility=default -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-cxx-exceptions -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -DEMSCRIPTEN -I/workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/SDL --sysroot=/workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot -Xclang -iwithsysroot/include/compat -v -fexceptions -O1 src/main.cpp -c -o /tmp/emscripten_temp_c176n3xf/main_0.o
clang version 16.0.0 (https://github.com/llvm/llvm-project effd75bda4b1a9b26e554c1cda3e3b4c72fa0aa8)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /workspaces/my_project/emsdk/upstream/bin
 (in-process)
 "/workspaces/my_project/emsdk/upstream/bin/clang-16" -cc1 -triple wasm32-unknown-emscripten -emit-obj --mrelax-relocations -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name main.cpp -mrelocation-model static -mframe-pointer=none -ffp-contract=on -fno-rounding-math -mconstructor-aliases -target-cpu generic -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -v -fcoverage-compilation-dir=/workspaces/my_project/wasmer -resource-dir /workspaces/my_project/emsdk/upstream/lib/clang/16.0.0 -D EMSCRIPTEN -I /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/SDL -isysroot /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot -internal-isystem /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten/c++/v1 -internal-isystem /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/c++/v1 -internal-isystem /workspaces/my_project/emsdk/upstream/lib/clang/16.0.0/include -internal-isystem /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten -internal-isystem /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include -O1 -fdeprecated-macro -fdebug-compilation-dir=/workspaces/my_project/wasmer -ferror-limit 19 -fvisibility=default -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -fcolor-diagnostics -iwithsysroot/include/compat -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-cxx-exceptions -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -o /tmp/emscripten_temp_c176n3xf/main_0.o -x c++ src/main.cpp
clang -cc1 version 16.0.0 based upon LLVM 16.0.0git default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten/c++/v1"
ignoring nonexistent directory "/workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten"
#include "..." search starts here:
#include <...> search starts here:
 /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/SDL
 /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/compat
 /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/c++/v1
 /workspaces/my_project/emsdk/upstream/lib/clang/16.0.0/include
 /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include
End of search list.
 "/workspaces/my_project/emsdk/upstream/bin/wasm-ld" -o main.wasm /tmp/emscripten_temp_c176n3xf/main_0.o -L/workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten -lGL -lal -lhtml5 -lstubs -lnoexit -lc -ldlmalloc -lcompiler_rt -lc++ -lc++abi -lsockets -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-cxx-exceptions -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --import-undefined --strip-debug --export-if-defined=main --export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm --export-if-defined=__start_em_lib_deps --export-if-defined=__stop_em_lib_deps --export-if-defined=__start_em_js --export-if-defined=__stop_em_js --export-if-defined=__main_argc_argv --export=stackSave --export=stackRestore --export=stackAlloc --export=__wasm_call_ctors --export=__errno_location --export=getTempRet0 --export=setTempRet0 --export=malloc --export=free --export=__cxa_is_pointer_type --export=__cxa_can_catch --export=setThrew --export-table -z stack-size=5242880 --initial-memory=16777216 --no-entry --max-memory=16777216 --global-base=1024
 "/workspaces/my_project/emsdk/node/14.18.2_64bit/bin/node" /workspaces/my_project/emsdk/upstream/emscripten/src/compiler.js /tmp/tmpiuf36wg2.json
 "/workspaces/my_project/emsdk/upstream/bin/llvm-objcopy" main.wasm main.wasm --remove-section=.debug* --remove-section=producers
 "/workspaces/my_project/emsdk/node/14.18.2_64bit/bin/node" /workspaces/my_project/emsdk/upstream/emscripten/tools/preprocessor.js /tmp/emscripten_temp_c176n3xf/settings.js shell.html
 "/workspaces/my_project/emsdk/node/14.18.2_64bit/bin/node" /workspaces/my_project/emsdk/upstream/emscripten/node_modules/.bin/html-minifier-terser main.html -o main.html --collapse-whitespace --collapse-inline-tag-whitespace --remove-comments --remove-tag-whitespace --sort-attributes --sort-class-name --decode-entities --collapse-boolean-attributes --remove-attribute-quotes --remove-redundant-attributes --remove-script-type-attributes --remove-style-link-type-attributes --use-short-doctype --minify-css true --minify-js true

em++ -v -s STANDALONE_WASM -sWASM_BIGINT -fexceptions -O1 src/main.cpp -o main.html

 "/workspaces/my_project/emsdk/upstream/bin/clang" --version
 "/workspaces/my_project/emsdk/upstream/bin/clang++" -target wasm32-unknown-emscripten -fvisibility=default -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-cxx-exceptions -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -DEMSCRIPTEN -I/workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/SDL --sysroot=/workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot -Xclang -iwithsysroot/include/compat -v -fexceptions -O1 src/main.cpp -c -o /tmp/emscripten_temp_x8mlw540/main_0.o
clang version 16.0.0 (https://github.com/llvm/llvm-project effd75bda4b1a9b26e554c1cda3e3b4c72fa0aa8)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /workspaces/my_project/emsdk/upstream/bin
 (in-process)
 "/workspaces/my_project/emsdk/upstream/bin/clang-16" -cc1 -triple wasm32-unknown-emscripten -emit-obj --mrelax-relocations -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name main.cpp -mrelocation-model static -mframe-pointer=none -ffp-contract=on -fno-rounding-math -mconstructor-aliases -target-cpu generic -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -v -fcoverage-compilation-dir=/workspaces/my_project/wasmer -resource-dir /workspaces/my_project/emsdk/upstream/lib/clang/16.0.0 -D EMSCRIPTEN -I /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/SDL -isysroot /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot -internal-isystem /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten/c++/v1 -internal-isystem /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/c++/v1 -internal-isystem /workspaces/my_project/emsdk/upstream/lib/clang/16.0.0/include -internal-isystem /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten -internal-isystem /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include -O1 -fdeprecated-macro -fdebug-compilation-dir=/workspaces/my_project/wasmer -ferror-limit 19 -fvisibility=default -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -fcolor-diagnostics -iwithsysroot/include/compat -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-cxx-exceptions -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -o /tmp/emscripten_temp_x8mlw540/main_0.o -x c++ src/main.cpp
clang -cc1 version 16.0.0 based upon LLVM 16.0.0git default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten/c++/v1"
ignoring nonexistent directory "/workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten"
#include "..." search starts here:
#include <...> search starts here:
 /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/SDL
 /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/compat
 /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include/c++/v1
 /workspaces/my_project/emsdk/upstream/lib/clang/16.0.0/include
 /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/include
End of search list.
 "/workspaces/my_project/emsdk/upstream/bin/wasm-ld" -o main.wasm /tmp/emscripten_temp_x8mlw540/main_0.o -L/workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten /workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/crt1.o -lGL -lal -lhtml5 -lstandalonewasm -lstubs -lc -ldlmalloc -lcompiler_rt -lc++ -lc++abi -lsockets -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-cxx-exceptions -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --import-undefined --strip-debug --export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm --export-if-defined=__start_em_lib_deps --export-if-defined=__stop_em_lib_deps --export-if-defined=__start_em_js --export-if-defined=__stop_em_js --export=stackSave --export=stackRestore --export=stackAlloc --export=__errno_location --export=getTempRet0 --export=setTempRet0 --export=malloc --export=free --export=__cxa_is_pointer_type --export=__cxa_can_catch --export=setThrew --export-table -z stack-size=5242880 --initial-memory=16777216 --max-memory=16777216 --global-base=1024
 "/workspaces/my_project/emsdk/node/14.18.2_64bit/bin/node" /workspaces/my_project/emsdk/upstream/emscripten/src/compiler.js /tmp/tmpwmllcz6m.json
 "/workspaces/my_project/emsdk/upstream/bin/llvm-objcopy" main.wasm main.wasm --remove-section=.debug* --remove-section=producers
 "/workspaces/my_project/emsdk/node/14.18.2_64bit/bin/node" /workspaces/my_project/emsdk/upstream/emscripten/tools/preprocessor.js /tmp/emscripten_temp_x8mlw540/settings.js shell.html
 "/workspaces/my_project/emsdk/node/14.18.2_64bit/bin/node" /workspaces/my_project/emsdk/upstream/emscripten/node_modules/.bin/html-minifier-terser main.html -o main.html --collapse-whitespace --collapse-inline-tag-whitespace --remove-comments --remove-tag-whitespace --sort-attributes --sort-class-name --decode-entities --collapse-boolean-attributes --remove-attribute-quotes --remove-redundant-attributes --remove-script-type-attributes --remove-style-link-type-attributes --use-short-doctype --minify-css true --minify-js true

Thanks
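
For readers comparing the two -v logs above, the wasm-ld invocations are hard to diff by eye. The following is a minimal sketch that splits both argument lists (copied from the logs, starting after the -L argument) and prints what is unique to each build. The variable names are illustrative, and the diff only shows how the link steps differ; it does not by itself identify the cause of the abort.

```python
# Arguments of the two wasm-ld commands from the logs, starting after -L<libdir>.
DEFAULT_LINK = (
    "-lGL -lal -lhtml5 -lstubs -lnoexit -lc -ldlmalloc -lcompiler_rt -lc++ -lc++abi -lsockets "
    "-mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-cxx-exceptions "
    "-mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --import-undefined --strip-debug "
    "--export-if-defined=main --export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm "
    "--export-if-defined=__start_em_lib_deps --export-if-defined=__stop_em_lib_deps "
    "--export-if-defined=__start_em_js --export-if-defined=__stop_em_js "
    "--export-if-defined=__main_argc_argv --export=stackSave --export=stackRestore "
    "--export=stackAlloc --export=__wasm_call_ctors --export=__errno_location "
    "--export=getTempRet0 --export=setTempRet0 --export=malloc --export=free "
    "--export=__cxa_is_pointer_type --export=__cxa_can_catch --export=setThrew --export-table "
    "-z stack-size=5242880 --initial-memory=16777216 --no-entry --max-memory=16777216 "
    "--global-base=1024"
)
STANDALONE_LINK = (
    "/workspaces/my_project/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/crt1.o "
    "-lGL -lal -lhtml5 -lstandalonewasm -lstubs -lc -ldlmalloc -lcompiler_rt -lc++ -lc++abi "
    "-lsockets -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-cxx-exceptions "
    "-mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --import-undefined --strip-debug "
    "--export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm "
    "--export-if-defined=__start_em_lib_deps --export-if-defined=__stop_em_lib_deps "
    "--export-if-defined=__start_em_js --export-if-defined=__stop_em_js "
    "--export=stackSave --export=stackRestore --export=stackAlloc --export=__errno_location "
    "--export=getTempRet0 --export=setTempRet0 --export=malloc --export=free "
    "--export=__cxa_is_pointer_type --export=__cxa_can_catch --export=setThrew --export-table "
    "-z stack-size=5242880 --initial-memory=16777216 --max-memory=16777216 --global-base=1024"
)

default_args = set(DEFAULT_LINK.split())
standalone_args = set(STANDALONE_LINK.split())

# Arguments that appear in only one of the two link commands.
only_default = sorted(default_args - standalone_args)
only_standalone = sorted(standalone_args - default_args)

print("only without STANDALONE_WASM:", *only_default, sep="\n  ")
print("only with STANDALONE_WASM:", *only_standalone, sep="\n  ")
```

Note that the exception-related exports (setThrew, __cxa_can_catch, __cxa_is_pointer_type) appear in both commands, so they do not show up in either list.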

sbc100 commented 1 year ago

What you are trying to do is somewhat contradictory, since emscripten exceptions depend on host JS code and are therefore fundamentally not compatible with being standalone.

We could probably try to make this work, and if you would like to submit a PR to make it work it would likely be accepted, but I wonder if it's worth the effort? Is there some reason why you cannot use the emscripten-generated JS, and therefore not need to worry about STANDALONE mode?

clauverjat commented 1 year ago

> What you are trying to do is somewhat contradictory, since emscripten exceptions depend on host JS code and are therefore fundamentally not compatible with being standalone.

I figured as much, which is why I added a quick note on my use case in my first message. I'll give more context below to explain why it would still be useful to support both options at the same time.

> Is there some reason why you cannot use the emscripten-generated JS, and therefore not need to worry about STANDALONE mode?

I am running WebAssembly from Python via Wasmtime, so I don't have a JS runtime at my disposal, unfortunately.

> We could probably try to make this work, and if you would like to submit a PR to make it work it would likely be accepted, but I wonder if it's worth the effort?

First, let me give you a bit more context on what I am trying to achieve. My end goal is to compile a C++ program to WebAssembly and then embed it in a Python package. I use Wasmtime to run WebAssembly from Python, but Wasmtime does not implement the Exception Handling proposal yet (and might not support it for a while, see https://github.com/bytecodealliance/wasmtime/issues/3427). I also considered Wasmer, but they haven't implemented the EH proposal either. My C++ program requires exceptions to work correctly, so disabling exceptions is not an option. In the meantime, I have to find another way to deal with exceptions.

My plan to circumvent the limitations of those runtimes is to implement exceptions in Python code, in the same way Emscripten uses JS to work around the lack of support for Wasm-EH. To this end, I have already ported the parts of the Emscripten JS glue code responsible for exception support (i.e. the invoke_... and cxa_... functions). With that, I was able to run my C++ sample code with a reasonable amount of work, so I am confident that my plan can work.
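
To make the idea concrete, here is a minimal, runtime-free sketch of the pattern. It is not the actual port: FakeInstance, CxxException, cxa_throw and make_invoke are hypothetical names, and the fake instance only imitates the stackSave, stackRestore and setThrew exports that appear in the link commands above. The assumption is that the host raises an exception from its throw hook and that each invoke wrapper catches it, restores the stack pointer and records the throw via setThrew.

```python
class CxxException(Exception):
    """Host-side stand-in for an in-flight C++ exception (hypothetical)."""

    def __init__(self, ptr):
        super().__init__(f"C++ exception object at {ptr:#x}")
        self.ptr = ptr


class FakeInstance:
    """Imitates the exports the glue relies on; a real port would call the
    stackSave/stackRestore/setThrew exports of the wasm instance instead."""

    def __init__(self):
        self.stack_pointer = 5242880
        self.threw = (0, 0)
        self.table = {}  # stands in for the exported function table

    def stackSave(self):
        return self.stack_pointer

    def stackRestore(self, sp):
        self.stack_pointer = sp

    def setThrew(self, threw, value):
        self.threw = (threw, value)


def cxa_throw(ptr, type_info, destructor):
    """Simplified throw hook: just unwind the host stack."""
    raise CxxException(ptr)


def make_invoke(instance):
    """Build a wrapper that calls a table entry and absorbs C++ throws."""

    def invoke(index, *args):
        sp = instance.stackSave()
        try:
            return instance.table[index](*args)
        except CxxException:
            instance.stackRestore(sp)  # drop frames abandoned by the throw
            instance.setThrew(1, 0)    # let the compiled code see the throw
            return 0                   # placeholder result for the caller

    return invoke


instance = FakeInstance()


def thrower(_arg):
    instance.stack_pointer -= 16  # pretend a callee allocated stack space
    cxa_throw(0x1234, 0, 0)


instance.table[1] = thrower
instance.table[2] = lambda x: x + 1
invoke = make_invoke(instance)

ok = invoke(2, 41)     # normal call, result passes through
caught = invoke(1, 0)  # throwing call, absorbed by the wrapper
```

After the second call the fake instance has its stack pointer restored and its threw flag set, which is the state the compiled code checks to decide whether to run a catch handler.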

That being said, without STANDALONE mode, the Emscripten compiler outputs wasm files that rely on a lot of other JS functions when compiling "real code". This was not an issue for my sample code because it is small, but porting all the JS glue would require a huge amount of work and would not be worth it. Thus, to minimize the amount of code to be ported, it would be great if STANDALONE mode and JS-based exceptions could be used together.
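
One way to gauge how much glue a given build needs is to list the imports of the generated .wasm file. The following is a small sketch under stated assumptions: list_imports is a hypothetical helper, it reads the import section of the WebAssembly binary format directly, and it only understands function, table, memory and global imports.

```python
KINDS = {0x00: "func", 0x01: "table", 0x02: "memory", 0x03: "global"}


def read_uleb(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_name(data, pos):
    """Decode a length-prefixed UTF-8 name; return (name, next_pos)."""
    length, pos = read_uleb(data, pos)
    return data[pos:pos + length].decode("utf-8"), pos + length


def skip_limits(data, pos):
    """Skip a limits record (flags, min, optional max)."""
    flags = data[pos]
    _, pos = read_uleb(data, pos + 1)   # minimum
    if flags & 0x01:
        _, pos = read_uleb(data, pos)   # maximum
    return pos


def list_imports(data):
    """Return (module, field, kind) for every import in a wasm binary."""
    if data[:8] != b"\0asm\x01\0\0\0":
        raise ValueError("not a version-1 wasm binary")
    imports = []
    pos = 8
    while pos < len(data):
        section_id = data[pos]
        size, pos = read_uleb(data, pos + 1)
        end = pos + size
        if section_id == 2:  # import section
            count, pos = read_uleb(data, pos)
            for _ in range(count):
                module, pos = read_name(data, pos)
                field, pos = read_name(data, pos)
                kind = data[pos]
                pos += 1
                if kind == 0x00:
                    _, pos = read_uleb(data, pos)       # type index
                elif kind == 0x01:
                    pos = skip_limits(data, pos + 1)    # reftype, then limits
                elif kind == 0x02:
                    pos = skip_limits(data, pos)
                elif kind == 0x03:
                    pos += 2                            # value type, mutability
                else:
                    raise ValueError(f"unsupported import kind {kind:#x}")
                imports.append((module, field, KINDS[kind]))
        pos = end  # jump to the next section
    return imports
```

Calling list_imports(open("main.wasm", "rb").read()) on the output of each build and grouping the result by module name gives a rough count of how many host functions would have to be provided from Python.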

Moreover, I did some research and it looks like @rygo6 had a similar goal to mine when he opened https://github.com/emscripten-core/emscripten/issues/15894. So I think other people might benefit if Emscripten supported it.

> We could probably try to make this work, and if you would like to submit a PR to make it work it would likely be accepted

After a bit of debugging, I found out why the program aborts when throwing an exception, and found a way to make it work. I will submit a PR with those changes.

Thanks

mhils commented 1 year ago

@clauverjat: Would you mind sharing your Python implementation of the invoke... and cxa... functions? I'm on a similar path here.

clauverjat commented 1 year ago

@mhils Sure, I'll contact you by email to see how we can arrange that.