emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler
Other
25.35k stars 3.25k forks source link

URL path generated for Worker imports doesn't work with Webpack #22140

Open Twinklebear opened 6 days ago

Twinklebear commented 6 days ago

Hi, I'm setting up a C++ WASM app to use threads that I distribute as part of a frontend app that uses webpack, however I'm running into an issue with how webpack is handling the URL for the worker import. In Emscripten's generated JS, the worker file is imported in allocateUnusedWorker as:

allocateUnusedWorker() {
  var worker;
  var workerOptions = {
    "type": "module",
    "name": "em-pthread"
  };
  worker = new Worker(new URL(import.meta.url), workerOptions);
  PThread.unusedWorkers.push(worker);
},

Webpack seems to resolve this import URL to a local file:// path, producing:

 allocateUnusedWorker() {
    var worker;
    var workerOptions = {
        "type": "module",
        "name": "em-pthread"
    };
    worker = new Worker(new URL("file:///Users/will/repos/webgpu-cpp-wasm/web/src/cpp/wgpu_app.js"),workerOptions);
    PThread.unusedWorkers.push(worker);
},

which results in the following error when trying to run the app, since the URL import path is not valid:

wgpu_app.js:1050 Uncaught (in promise) DOMException: Failed to construct 'Worker': Script at 'file:///Users/will/repos/webgpu-cpp-wasm/web/src/cpp/wgpu_app.js' cannot be accessed from origin 'http://localhost:8080'.
    at Object.allocateUnusedWorker (http://localhost:8080/6009193530ba34a86bac.js:4590:14)
    at Object.initMainThread (http://localhost:8080/6009193530ba34a86bac.js:4487:15)
    at Object.init (http://localhost:8080/6009193530ba34a86bac.js:4481:15)
    at http://localhost:8080/6009193530ba34a86bac.js:12723:9
    at http://localhost:8080/6009193530ba34a86bac.js:982:81

If I patch the function to change the import URL to (where script_filename.js is the name of the JS output file) webpack is able to resolve the path correctly:

allocateUnusedWorker() {
  var worker;
  var workerOptions = {
    "type": "module",
    "name": "em-pthread"
  };
  worker = new Worker(new URL("script_filename.js", import.meta.url).href, workerOptions);
  PThread.unusedWorkers.push(worker);
},

It also works without the .href, though webpack will then complain about circular dependencies between the files breaking filename hashing (since the file now depends on itself).

I noticed that this is the same URL structure that findWasmBinary returns to be compatible with bundlers: https://github.com/emscripten-core/emscripten/blob/d079eca24fa739e202b9145c7dcc8054b8165b77/src/preamble.js#L617 . Should the worker import be changed to something like below (or without the href?)

worker = new Worker('{{{ JS_FILE }}}', import.meta.url).href;

The full project code is here: https://github.com/Twinklebear/webgpu-cpp-wasm/tree/threads if you want to test it, the patch step can be removed by commenting out this line in src/CMakeLists.txt or just making the patch script a noop.

Or if I'm doing something wrong with my webpack config to import these files, let me know.

Version of emscripten/emsdk:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.61-git
sbc100 commented 5 days ago

Hmm.. it seems odd to me that new URL(import.meta.url) doesn't work as expected. shouldn't that be exactly the same as new URL('basename.js', import.meta.url)?

If webpack is expanding import.meta.url to "file:///Users/will/repos/webgpu-cpp-wasm/web/src/cpp/wgpu_app.js" then that would be wrong even in the later case above, no? Or is that expansion not happening when URL is constructed with two argument?

sbc100 commented 5 days ago

Regarding the self-dependency, do we really want to suppress that warning? I mean, the file really does depend on itself.. that is how it works.. we don't want webpack to thing any different do we?

sbc100 commented 5 days ago

@RReverser does new URL('{{{ JS_FILE }}}', import.meta.url) make sense for some reason over new URL(import.meta.url) when trying to get the URL of the current file?

Twinklebear commented 5 days ago

I hadn't found it in the docs earlier, but looking at webpack's docs on import.meta: https://webpack.js.org/blog/2020-10-10-webpack-5-release/#importmeta :

import.meta.url is the file: url of the current file (similar to __filename but as file url)

So it sounds like new URL(import.meta.url) translating to the file:// URL is expected behavior from webpack.

In the native worker support section they mention using the new Worker(new URL("file.js", import.meta.url)) syntax: https://webpack.js.org/blog/2020-10-10-webpack-5-release/#native-worker-support . It seems like Vite is similar: https://vitejs.dev/guide/assets#new-url-url-import-meta-url ? I'm not much of a web bundling expert though. So maybe using new URL('{{{ JS_FILE }}}', import.meta.url) is the right pattern for bundlers?

Omitting the .href and having the circular dependency is fine for me (it also sounds like it may cause Vite to not process things the way someone would want?). In our app we add the release version to our JS file names so they're unique whether webpack renames them or not.

sbc100 commented 5 days ago

I'm trying to reproduce the issue using webpack in our test harness, but I'm struggling to do so.

I've added a webpack test that includes EXPORT_ES6 here: https://github.com/emscripten-core/emscripten/pull/22142

However, I can't seem to make it work with pthreads, with or without EXPORT_ES6.

RReverser commented 5 days ago

@RReverser does new URL('{{{ JS_FILE }}}', import.meta.url) make sense for some reason over new URL(import.meta.url) when trying to get the URL of the current file?

Yeah the former is recognised as a pattern by a lot of bundlers, but the latter isn't, even though in runtime they'd be the same, yes.

sbc100 commented 5 days ago

@RReverser does new URL('{{{ JS_FILE }}}', import.meta.url) make sense for some reason over new URL(import.meta.url) when trying to get the URL of the current file?

Yeah the former is recognised as a pattern by a lot of bundlers, but the latter isn't, even though in runtime they'd be the same, yes.

Damn.. thats kind of a shame. IMHO, the later is much cleaner since its independent of that actual filename.

The former requires that extra tempting step, more characters, and is not portable across file renames.

dr-matt commented 2 days ago

I've hit this issue as well after recently updating emsdk. What is the disposition here? Is there a workaround that doesn't involve manual patching? edit: reverting to v3.1.57 is one workaround - that seems to be the last version before this change in url construction.

sbc100 commented 2 days ago

I'd be happy to see a PR submitted to fix this, and hopefully it can come with and update to test_webpack_es6 that reproduces this failure and shows that its fixes.

Twinklebear commented 1 day ago

The issue may have been missing the -sPTHREAD_POOL_SIZE=1 (or otherwise creating a worker to run a thread in the C code), since without that the allocateUnusedWorker function isn't called. I've opened a PR to fix the generated JS code and update the test: #22165