emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

With a pthread build, and bundled with webpack, creation of new Web Worker fails due to protocol=file: in import.meta.url #22521

Open badgermole opened 2 months ago

badgermole commented 2 months ago

Version of emscripten/emsdk: 3.1.64


Please include the output emcc -v here:

"/emsdk/upstream/bin/clang++" -target wasm32-unknown-emscripten -mllvm -combiner-global-alias-analysis=false -mllvm -wasm-enable-sjlj -mllvm -disable-lsr --sysroot=/emsdk/upstream/emscripten/cache/sysroot -D__EMSCRIPTEN_SHARED_MEMORY__=1 -DEMSCRIPTEN -Xclang -iwithsysroot/include/fakesdl -Xclang -iwithsysroot/include/compat -std=c++20 -D__linux -Wno-backslash-newline-escape -Wno-bitwise-op-parentheses -Wno-deprecated-register -Wno-inconsistent-missing-override -Wno-logical-op-parentheses -Wmissing-declarations -Wdefaulted-function-deleted -Wno-unused-value -g3 -fwasm-exceptions -v -pthread -D__WASM32__ -I. -c -matomics -mbulk-memory MyAPI.cpp -o build_pt/dbg/MyAPI.o
clang version 19.0.0git (https:/github.com/llvm/llvm-project 4d8e42ea6a89c73f90941fd1b6e899912e31dd34)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /emsdk/upstream/bin
 (in-process)
 "/emsdk/upstream/bin/clang-19" -cc1 -triple wasm32-unknown-emscripten -emit-obj -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name MyAPI.cpp -mrelocation-model static -mframe-pointer=none -ffp-contract=on -fno-rounding-math -mconstructor-aliases -target-feature +atomics -target-feature +bulk-memory -target-feature +mutable-globals -target-feature +sign-ext -target-feature +exception-handling -mllvm -wasm-enable-eh -target-feature +multivalue -target-feature +reference-types -target-feature +exception-handling -exception-model=wasm -target-feature +multivalue -target-feature +reference-types -target-cpu generic -target-feature +atomics -target-feature +bulk-memory -fvisibility=hidden -debug-info-kind=constructor -dwarf-version=4 -debugger-tuning=gdb -fdebug-compilation-dir=/mnt/wasm_export -v -fcoverage-compilation-dir=/mnt/wasm_export -resource-dir /emsdk/upstream/lib/clang/19 -D __EMSCRIPTEN_SHARED_MEMORY__=1 -D EMSCRIPTEN -D __linux -D __WASM32__ -I . 
-isysroot /emsdk/upstream/emscripten/cache/sysroot -internal-isystem /emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten/c++/v1 -internal-isystem /emsdk/upstream/emscripten/cache/sysroot/include/c++/v1 -internal-isystem /emsdk/upstream/lib/clang/19/include -internal-isystem /emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten -internal-isystem /emsdk/upstream/emscripten/cache/sysroot/include -Wno-backslash-newline-escape -Wno-bitwise-op-parentheses -Wno-deprecated-register -Wno-inconsistent-missing-override -Wno-logical-op-parentheses -Wmissing-declarations -Wdefaulted-function-deleted -Wno-unused-value -std=c++20 -fdeprecated-macro -ferror-limit 19 -pthread -fgnuc-version=4.2.1 -fno-implicit-modules -fskip-odr-check-in-gmf -fcxx-exceptions -fexceptions -exception-model=wasm -fcolor-diagnostics -iwithsysroot/include/fakesdl -iwithsysroot/include/compat -mllvm -combiner-global-alias-analysis=false -mllvm -wasm-enable-sjlj -mllvm -disable-lsr -o build_pt/dbg/MyAPI.o -x c++ MyAPI.cpp
clang -cc1 version 19.0.0git based upon LLVM 19.0.0git default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten/c++/v1"
ignoring nonexistent directory "/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten"
#include "..." search starts here:
#include <...> search starts here:
 .
 /emsdk/upstream/emscripten/cache/sysroot/include/fakesdl
 /emsdk/upstream/emscripten/cache/sysroot/include/compat
 /emsdk/upstream/emscripten/cache/sysroot/include/c++/v1
 /emsdk/upstream/lib/clang/19/include
 /emsdk/upstream/emscripten/cache/sysroot/include
End of search list.

Failing command line in full: N/A

Full link command and output with -v appended:

 /emsdk/upstream/bin/clang --version
emcc: warning: -pthread + ALLOW_MEMORY_GROWTH may run non-wasm code slowly, see https://github.com/WebAssembly/design/issues/1271 [-Wpthreads-mem-growth]
 /emsdk/upstream/bin/wasm-ld -o ./bin_pt/dbg/my.wasm ./build_pt/dbg/MyAPI.o -lembind-rtti -L/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten /emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/crtbegin.o -lGL-mt-webgl2-full_es3-getprocaddr -lal -lhtml5 -lbulkmemory -lstubs-debug -lnoexit -lc-mt-debug -ldlmalloc-mt-debug -lcompiler_rt-wasm-sjlj-mt -lc++-mt-except -lc++abi-debug-mt-except -lunwind-mt-except -lsockets-mt -lwasmfs_no_fs -lwasmfs-mt-debug -mllvm -combiner-global-alias-analysis=false -mllvm -wasm-enable-sjlj -mllvm -disable-lsr -mllvm -wasm-enable-eh -mllvm -exception-model=wasm /tmp/tmp0glxjktdlibemscripten_js_symbols.so --import-memory --shared-memory --export=sbrk --export=emscripten_stack_get_end --export=emscripten_stack_get_free --export=emscripten_stack_get_base --export=emscripten_stack_get_current --export=emscripten_stack_init --export=wasmfs_flush --export=_emscripten_stack_alloc --export=emscripten_get_sbrk_ptr --export=__getTypeName --export=_emscripten_thread_free_data --export=_emscripten_thread_crashed --export=emscripten_main_runtime_thread_id --export=emscripten_main_thread_process_queued_calls --export=_emscripten_run_on_main_thread_js --export=emscripten_stack_set_limits --export=_embind_initialize_bindings --export=__get_temp_ret --export=__set_temp_ret --export=__trap --export=__cpp_exception --export=__wasm_call_ctors --export=_emscripten_tls_init --export=_emscripten_thread_init --export=_emscripten_stack_restore --export=_emscripten_thread_exit --export=malloc --export=__get_exception_message --export=free --export=__thrown_object_from_unwind_exception --export=__cxa_increment_exception_refcount --export=__cxa_decrement_exception_refcount --export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm --export-if-defined=__start_em_lib_deps --export-if-defined=__stop_em_lib_deps --export-if-defined=__start_em_js --export-if-defined=__stop_em_js --export-if-defined=main 
--export-if-defined=__main_argc_argv --export-if-defined=fflush --export-table --growable-table -z stack-size=65536 --max-memory=4294967296 --initial-memory=1073741824 --no-entry --stack-first --table-base=1
 /emsdk/upstream/bin/llvm-objcopy ./bin_pt/dbg/my.wasm ./bin_pt/dbg/my.wasm --remove-section=producers
 /emsdk/upstream/bin/wasm-emscripten-finalize -g --dyncalls-i64 --pass-arg=legalize-js-interface-exported-helpers --dwarf ./bin_pt/dbg/my.wasm -o ./bin_pt/dbg/my.wasm --detect-features
 /emsdk/node/18.20.3_64bit/bin/node /emsdk/upstream/emscripten/src/compiler.mjs /tmp/tmp147_ng6m.json
warning: JS library symbol '$ALLOC_NORMAL' is deprecated. Please open a bug if you have a continuing need for this symbol [-Wdeprecated]
warning: JS library symbol '$ALLOC_STACK' is deprecated. Please open a bug if you have a continuing need for this symbol [-Wdeprecated]
warning: JS library symbol '$allocate' is deprecated. Please open a bug if you have a continuing need for this symbol [-Wdeprecated]
emcc: warning: warnings in JS library compilation [-Wjs-compiler]
emcc: warning: running limited binaryen optimizations because DWARF info requested (or indirectly required) [-Wlimited-postlink-optimizations]
 /emsdk/upstream/bin/wasm-opt --safe-heap ./bin_pt/dbg/my.wasm -o ./bin_pt/dbg/my.wasm -g --mvp-features --enable-threads --enable-bulk-memory --enable-exception-handling --enable-multivalue --enable-mutable-globals --enable-reference-types --enable-sign-ext
 /emsdk/node/18.20.3_64bit/bin/node /emsdk/upstream/emscripten/tools/acorn-optimizer.mjs /tmp/emscripten_temp_2vamddqk/my.js unsignPointers --closure-friendly --export-es6 -o /tmp/emscripten_temp_2vamddqk/my.jso1.js
 /emsdk/node/18.20.3_64bit/bin/node /emsdk/upstream/emscripten/tools/acorn-optimizer.mjs /tmp/emscripten_temp_2vamddqk/my.jso1.js growableHeap --closure-friendly --export-es6 -o /tmp/emscripten_temp_2vamddqk/my.jso2.js
 /emsdk/node/18.20.3_64bit/bin/node /emsdk/upstream/emscripten/tools/acorn-optimizer.mjs /tmp/emscripten_temp_2vamddqk/my.jso1.js.pgrow.js safeHeap --closure-friendly --export-es6 -o /tmp/emscripten_temp_2vamddqk/my.jso1.js.jso3.js

Issue: With a pthread build, and bundled with webpack, creation of new Web Worker fails due to protocol=file: in import.meta.url.

The short description is that I have a project (project-A) where we are compiling our WASM target with -pthread. Project-A is bundled with webpack. Project-A is in turn used by project-B, which also uses webpack as a bundler, so the bundling is nested. In the Module JS file, the line worker = new Worker(new URL("my.js", import.meta.url), workerOptions); fails because the protocol of import.meta.url is ‘file:’.

I have not been able to find a webpack-based way to work through this issue. The workaround I have involves hand-editing my.js to do the following (minus error checking here): worker = new Worker(Module["wasmJsURL"].href, workerOptions);, where wasmJsURL is determined by the main application, since it knows the location.origin and path and can send the complete URL when instantiating the WASM JS Module object: new MyAppWASM( { instantiateWasm: onInstantiateWasm, noExitRuntime: true, wasmJsURL: this.wasmJsURL } )….
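
The application-side half of this workaround could be sketched roughly as follows. This is only an illustration: buildWasmJsUrl and its parameters are hypothetical names, the origin and directory are assumed example values, and wasmJsURL is the custom Module property described above (it is not an Emscripten setting and only has an effect with the hand-edited my.js).

```javascript
// Hypothetical helper (not part of Emscripten): resolve the absolute URL of
// my.js from the page origin and the directory the wasm artifacts are served from.
function buildWasmJsUrl(origin, assetDir, fileName = "my.js") {
  // A trailing "/" is required so URL resolution keeps the directory segment.
  const dir = assetDir.endsWith("/") ? assetDir : assetDir + "/";
  return new URL(fileName, new URL(dir, origin));
}

// Assumed example values; in a browser the origin would come from location.origin.
const wasmJsURL = buildWasmJsUrl("http://localhost:8088", "/mywasm");
console.log(wasmJsURL.href);
```

The resulting URL object would then be passed as the wasmJsURL property of the options object given to the module factory, as in the instantiation call quoted above.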

Might there be another approach that I’m not aware of? I can put a minimal example in a github project if needed. Or is it a valid request to have Emscripten offer the ability for passing the WASM JS URL during instantiation for use in Worker creation?

Here are a few references I've collected that have, or may have, bearing:

#22140

#20580

#16878

sbc100 commented 2 months ago

Is worker = new Worker(new URL("my.js", import.meta.url), workerOptions) a problem on its own? Or is it only a problem when doing nested bundling? i.e. is webpack rewriting this line to be something else in the bundled output?

The worker = new Worker(new URL("my.js", import.meta.url), workerOptions) pattern is, I believe, OK for webpack. In fact, we recently fixed an issue with that line and added a test for it in #22165.

badgermole commented 2 months ago

I was thrilled when I saw #22165 because I was having the same issue, and that did resolve it in that case. So a qualified 'yes' to whether #22165 fixes the non-nested case. However, it's still an issue with the project I'm working on now, even when not doing nested bundling, as it turns out. What seems to work most reliably for me is passing down the URL from the application when instantiating the module object.

sbc100 commented 2 months ago

If you do new Worker(Module["wasmJsURL"].href, workerOptions); then won't it be impossible for webpack to see the name of the file you are trying to load, and therefore impossible for webpack to "bundle" said file? Isn't the reason webpack is trying to parse the call to new Worker so that it can bundle the worker file and rewrite the new Worker call appropriately?

@RReverser any insights here? Am I understanding things right?

RReverser commented 2 months ago

@RReverser any insights here? Am I understanding things right?

Sounds right. Webpack needs specifically the new Worker(new URL('...', import.meta.url)) pattern.

badgermole commented 2 months ago

You're right. That and the file: protocol getting used for import.meta.url led me to exclude my.js from the webpack bundle, putting it in the same directory as my.wasm. But that only works if I can pass down the URL so it can be picked up where the worker is created. At this point I can say, for the non-nested case, new Worker(new URL('...', import.meta.url)) does work. But for the nested case, I still haven't seen my way through. I get:

my.js:1582 import.meta.url:  file:///D:/.../wasm_export/bin_pt/dbg/my.js
my.js:1583 Uncaught (in promise) SecurityError: Failed to construct 'Worker': Script at 'file:///D:/.../wasm_import/mywasm/bin_pt_dbg_my_js.MyWasm.min.js' cannot be accessed from origin 'http://localhost:8088'.
    at Object.allocateUnusedWorker (my.js:1583:14)
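
The SecurityError above is consistent with the browser's same-origin requirement for worker scripts: a script URL with the file: scheme cannot match a page served over http://localhost:8088. A minimal sketch of that comparison using the standard URL API follows; isSameOrigin is a hypothetical helper, and the file path is a made-up stand-in for the elided path in the log.

```javascript
// Hypothetical helper: true when scriptUrl (absolute, or relative to pageUrl)
// has the same scheme and host:port as the page that wants to start the worker.
function isSameOrigin(scriptUrl, pageUrl) {
  const script = new URL(scriptUrl, pageUrl);
  const page = new URL(pageUrl);
  return script.protocol === page.protocol && script.host === page.host;
}

// Made-up file: path standing in for the one baked in at bundle time.
console.log(isSameOrigin("file:///D:/example/my.js", "http://localhost:8088/"));
// A URL resolved relative to the page stays on the page's origin.
console.log(isSameOrigin("mywasm/my.js", "http://localhost:8088/"));
```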

I've put together as minimal an example as I could into repo emsdk_pthread_nested_webpack if anyone has time to take a look.

Another interesting note is that webpack duplicates my.js. It emits it once in a hash-named file and once in MyWasm.min.js, and it also gives a circular dependency warning. However, if I replace
worker = new Worker(new URL("my.js", import.meta.url), workerOptions);
with
var my_js = "my.js";
worker = new Worker(new URL(my_js, import.meta.url), workerOptions);
it doesn't do that, and puts it only in MyWasm.min.js.

sbc100 commented 2 months ago

Right, I guess when you do worker = new Worker(new URL(my_js, import.meta.url), workerOptions); then webpack doesn't know what file you are trying to load, so it can't actually bundle said file and it gives up. You are effectively defeating the bundling process and, I believe, just falling back to using your original unbundled "my.js" file.

Note that in both cases, don't you end up with two separate copies of the main JS file? You either end up using a hash-named file or you end up using the original my.js that emscripten generated. Is that right?

This seems unfortunate and kind of defeats the point of bundling, doesn't it? i.e. without bundling you just have a single my.js file, but with bundling the code in my.js exists in two different files that must be downloaded to the client separately. Is that right, @RReverser?

badgermole commented 1 month ago

I apologize for neglecting this - I had to switch tasks for a bit. I appreciate your comments and questions up to this point.
I should also try opening an issue in webpack's repo, since I'm not convinced there isn't a webpack-based solution. But at the moment, I've worked around it by post-processing my.js to change worker = new Worker(new URL("my.js", import.meta.url), workerOptions) to

if ( Module[ "wasmJsURL" ] ) {
    worker = new Worker( new URL( Module[ "wasmJsURL" ].href ), workerOptions );
}
else {
    var jsFile = "my.js";
    worker = new Worker( new URL( jsFile, import.meta.url ), workerOptions );
}
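
A post-processing step like this could be sketched as a plain string replacement. Everything here is an assumption for illustration: patchWorkerCreation, ORIGINAL, and PATCHED are hypothetical names, and the exact text of the line in the generated my.js (whitespace, quoting, trailing semicolon) may differ between Emscripten versions and build settings.

```javascript
// Assumed exact text of the generated line; verify against your own my.js.
const ORIGINAL =
  'worker = new Worker(new URL("my.js", import.meta.url), workerOptions);';

// Replacement: prefer a URL supplied by the application, else keep the default.
const PATCHED = [
  'if (Module["wasmJsURL"]) {',
  '  worker = new Worker(new URL(Module["wasmJsURL"].href), workerOptions);',
  '} else {',
  '  var jsFile = "my.js";',
  '  worker = new Worker(new URL(jsFile, import.meta.url), workerOptions);',
  '}',
].join("\n");

// Hypothetical helper: returns the patched source, or throws if the expected
// line is missing so that a silent no-op cannot slip into a build.
function patchWorkerCreation(source) {
  if (!source.includes(ORIGINAL)) {
    throw new Error("worker creation line not found; my.js layout may have changed");
  }
  // A replacer function avoids special "$" handling in the replacement string.
  return source.replace(ORIGINAL, () => PATCHED);
}
```

In a build script, the function would be applied to the contents of my.js (for example, read with fs.readFileSync and written back with fs.writeFileSync) after emcc has run.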

@sbc100, like you said, I end up with two copies, one bundled and one not. So I'll still be probing for a real solution. But it does get me past this for the moment.
I'm not sure what your policy is for closing issues. I don't really consider this closed, but it's seeming like it may drag on a bit given the time I can spend on it.🫤