emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

[question] How to await all run dependencies (data-files) to be satisfied? #14362

Open vadimkantorov opened 3 years ago

vadimkantorov commented 3 years ago

I'm using Emscripten in a Worker. I load data packages manually (by calling importScripts on the data package JS files, with some unfortunately necessary hacks around static variables, see https://github.com/emscripten-core/emscripten/issues/12347) and then call main manually, so I'd like to wait until all the data files are loaded and all run dependencies are met.

Is there a way to achieve that?

Currently two things happen independently: the program executes (and fails, because the required files are not yet present in the FS), and the runtime also complains that it is still waiting for dependencies to be satisfied.

Thanks!

kripken commented 3 years ago

What do you mean by "data package"?

If you mean something created by tools/file_packager.py then things should just work (if you load the package scripts first).

If you mean something else, then you can call addRunDependency / removeRunDependency manually. That is the internal system that is used to wait for things before startup. That is, call add when you start to import something, and call remove after it has arrived. The last call to remove will call run.
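
A minimal sketch of that add/remove pattern, assuming addRunDependency and removeRunDependency are reachable from your code (depending on build settings they may be available only inside the generated JS or may need to be exported onto Module). The helper name trackRunDependency and its runtime parameter are hypothetical, used only for illustration:

```javascript
// Hypothetical helper: registers a run dependency while an async load is in
// flight and releases it once the load has succeeded.
// `runtime` is any object exposing addRunDependency/removeRunDependency.
async function trackRunDependency(runtime, id, load) {
  runtime.addRunDependency(id);      // startup now waits for `id`
  const result = await load();       // if this rejects, `id` stays pending
  runtime.removeRunDependency(id);   // the last removal lets run() proceed
  return result;
}
```

The dependency is deliberately left pending when the load fails, so startup does not proceed without the data; a caller that prefers to continue anyway would remove it in a catch or finally block instead.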

vadimkantorov commented 3 years ago

Yes, I mean js/data files produced by file_packager. In my case, it doesn't wait for some reason - maybe because I sidestep the standard startup code in order to get more control over calling the main function. So I'd like to do it manually.

That is, I don't use run or callMain because of the problems described in https://github.com/emscripten-core/emscripten/issues/12219

vadimkantorov commented 3 years ago

I've tried something like the pseudocode below, but it didn't work: it seems Module.run never gets called, so the promise hangs:

Module.calledRun = false;
const dependencies_fulfilled = new Promise(resolve => (Module.run = resolve));
importScripts('my_data_package.js');
Module.preRun(); // pseudocode: I do call the preRun function added by importScripts
await dependencies_fulfilled;

vadimkantorov commented 3 years ago

It seems that adding data packages after the initial run isn't supported. It would be good to have this as a first-class feature.

Currently it's not even possible to hack around this, because there is no way to reset the internal calledRun variable (NB: not Module.calledRun).
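
To illustrate the distinction, here is a simplified model (not Emscripten's actual source) of a runtime that guards startup with a closure-local flag and only mirrors it onto Module. The name createRuntimeModel and the run counter are assumptions made for this example:

```javascript
// Simplified model of a startup guard kept in a closure-local variable.
function createRuntimeModel(Module) {
  let calledRun = false;             // private: unreachable from outside
  let runCount = 0;
  function run() {
    if (calledRun) return;           // the guard reads the private flag...
    calledRun = true;
    Module.calledRun = true;         // ...the property is only a mirror
    runCount++;
  }
  return { run, getRunCount: () => runCount };
}

const Module = {};
const runtime = createRuntimeModel(Module);
runtime.run();
Module.calledRun = false;            // resets the mirror, not the guard
runtime.run();                       // still a no-op
console.log(runtime.getRunCount());  // prints 1
```

In this model, assigning to Module.calledRun from outside has no effect on whether run executes again, which is the behavior described above.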

vadimkantorov commented 3 years ago

Basically, I need a flexible, controllable mechanism for loading data packages independently of the module, so that I can re-execute runWithFs when needed and await its completion.

Ideally, it should be something like:

const data_package_promise = load_data_package('my_data_package.js');
...
const data_package = await data_package_promise;
console.log('Contained file paths:', data_package.file_paths);
await data_package.create_preloaded_files(Module);

At the very least, Module should expose a dependenciesFulfilled event.
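
Until such an event exists, one possible approximation is to wrap the Module.monitorRunDependencies hook, which the generated runtime calls with the number of outstanding run dependencies whenever that count changes. The sketch below assumes the hook is looked up at call time and is enabled in your build; whenRunDependenciesSettle is a hypothetical name, and this only observes the counter, it does not make main wait:

```javascript
// Hypothetical helper: returns a promise that resolves the next time the
// run-dependency count reported by the runtime drops to zero.
function whenRunDependenciesSettle(Module) {
  const previous = Module.monitorRunDependencies;  // keep any existing hook
  return new Promise((resolve) => {
    // The executor runs synchronously, so the hook is installed right away.
    Module.monitorRunDependencies = (left) => {
      if (previous) previous(left);
      if (left === 0) {
        Module.monitorRunDependencies = previous;  // restore the old hook
        resolve();
      }
    };
  });
}
```

The helper would need to be called before the data package script is imported, so that the package's own add/remove calls are observed. If no dependency is ever added, the promise never resolves.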

cc @kripken

kripken commented 3 years ago

For things after the initial run, there was a concept of "blockers" at some point. I think it would block the main loop from continuing. I think that is still present in the code (can look in the code for emscripten_main_loop etc.) but it is a very specific mechanism, and different codebases might need different things. I assume most users write their own async system for handling deps and so forth, based on their specific needs.

vadimkantorov commented 3 years ago

I'm not talking about "stopping" the running module, just about having explicit events/promises for every loaded data package, which would let the client await them easily.

Anyway, it seems that package loading would benefit from a more transparent, promise-based solution. It might also help with debugging deadlocks.

kripken commented 3 years ago

Good point. Yes, if it was written today it would be Promise-based, I agree. It would be nice if someone had time and interest to look into that!

vadimkantorov commented 3 years ago

If the IDBFS helper methods (and maybe other helper methods) were in the main JS library instead, it would be easier to start this refactoring, since one could even parse the produced offsets / file list.

Another suggestion would be for file_packager.py to also produce a JSON or binary file describing the offsets and file list. Then hacking together a custom package loader would be a much more pleasant task.
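
As a rough sketch of what a custom loader could do with such a description, the function below slices a downloaded package blob into per-file views. The manifest shape ({ files: [{ filename, start, end }] }, with byte offsets into the data blob) and the name extractPackagedFiles are assumptions for illustration, not a documented format:

```javascript
// Hypothetical manifest shape: { files: [{ filename, start, end }] } where
// start/end are byte offsets into the packaged data blob (a Uint8Array).
function extractPackagedFiles(packageBytes, manifest) {
  const files = new Map();
  for (const { filename, start, end } of manifest.files) {
    if (start < 0 || end < start || end > packageBytes.length) {
      throw new RangeError(`Bad offsets for ${filename}: ${start}..${end}`);
    }
    // subarray creates a view over the same buffer, so nothing is copied
    files.set(filename, packageBytes.subarray(start, end));
  }
  return files;
}
```

Each resulting view could then be written into the virtual file system, for example with FS.writeFile once the parent directories exist, and the whole step wrapped in a promise that the client awaits.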