emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

emscripten objects/modules deletion/unloading #18080

Open neatudarius opened 2 years ago

neatudarius commented 2 years ago

Hello! Could you help me work out how to track the lifetime of objects loaded in an Emscripten app (TS + C++)?

Context: I want to implement a feature that reloads the app in the same Worker (without killing it). To do that, I need to destroy all objects inside this worker. "Destroy" here means either making sure the object is garbage collected or manually destroying it where needed (see 3). Basically, I want a shutdown mechanism that releases all memory (TS + C++) and lets me check that I have no leaks, like in the C++ world :D.

I found a few types of objects in my project:

  1. “standard” objects: objects created in TS (not C++ related, e.g. no bindings at all) are automatically freed by the GC. This also includes objects created and kept inside C++ using emscripten::val. Basically, 1 is handled by the GC.

  2. “shared types/objects” + stack allocation: objects allocated on the stack in C++ and returned to JS. They are automatically GC-ed (like standard JS objects).

  3. “shared types/objects” + heap allocation: e.g. a type defined in C++ and bound to a TS/JS interface. If I end up using dynamic allocation in TS (a.k.a. new $interface_from_cpp()), I must manually call $object.delete() in JS (see the docs: “It is strongly recommended that JavaScript code explicitly deletes any C++ object handles it has received. The delete() JavaScript method is provided to manually signal that a C++ object is no longer needed and can be deleted. Both C++ objects constructed from the JavaScript side as well as those returned from C++ methods must be explicitly deleted.”).

  4. EmscriptenModule-like objects, which, from my understanding, are what actually load the WASM file into the browser (?). Is the lifetime of the EmscriptenModule object the same as that of the WASM? (e.g. will releasing all references to the module automatically unload the WASM?)

  5. bindings: by using EMSCRIPTEN_BINDINGS, it seems the runtime creates some global variables (Emscripten internals). They seem to be released when atexit callbacks run, if that ever happens (e.g. in a C++ main()-based program; the docs say: “That means that when main() exits, we don’t flush the stdio streams, or call the destructors of global C++ objects, or call atexit callbacks”). In my case, I don't have a main() function. I am running in a requestAnimationFrame-based loop. How can I trigger destruction of the actual runtime / bindings?
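
For case 3, one way to make manual deletion auditable at shutdown is to route every handle through a small registry. The following is a minimal sketch, not an Emscripten API: `HandleRegistry` and `makeFakeHandle` are hypothetical names, and the fake handle only mimics the `delete()` / `isDeleted()` methods that embind handles expose.

```javascript
// Hypothetical helper: tracks embind-style handles so they can all be deleted at shutdown.
class HandleRegistry {
  constructor() {
    this.handles = new Set();
  }

  // Register a handle and return it so the call can wrap a constructor expression.
  track(handle) {
    this.handles.add(handle);
    return handle;
  }

  // Delete one handle early and stop tracking it.
  release(handle) {
    if (this.handles.delete(handle) && !handle.isDeleted()) {
      handle.delete();
    }
  }

  // Delete everything still alive; returns how many handles were deleted here.
  shutdown() {
    let deleted = 0;
    for (const handle of this.handles) {
      if (!handle.isDeleted()) {
        handle.delete();
        deleted++;
      }
    }
    this.handles.clear();
    return deleted;
  }
}

// Stand-in for a real embind handle, used only to make the sketch runnable.
function makeFakeHandle() {
  let deleted = false;
  return {
    delete() { deleted = true; },
    isDeleted() { return deleted; },
  };
}

const registry = new HandleRegistry();
const a = registry.track(makeFakeHandle());
const b = registry.track(makeFakeHandle());
registry.release(a);                           // deleted early by the app
const deletedAtShutdown = registry.shutdown(); // deletes whatever is left (here: b)
```

A non-zero return value from `shutdown()` in a test run is a cheap signal that some code path forgot its own `delete()` call.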

Q1: Do you know of any kind of object that is not in {1, 2, 3, 4, 5}?

Q2: If 4 is true (destroy module => unload WASM), how can I test this? I don’t really know how to verify that it works (e.g. check that the WASM is no longer present, or find out that my operation is a no-op: the module object was GC-ed but the WASM is still present). I couldn't find an answer with Chrome DevTools (Memory tab); maybe I am missing something.

Q3: How do I solve 5? :D Unload WASM => destroy bindings?

Q4: I know about the Emscripten Sanitizer; this is how I discovered that I have memory leaks for bindings. Do you recommend other tools?

Please let me know if I can give you any other information. Thanks!

brendandahl commented 1 year ago

I'm relatively new to the project, so I don't have a good idea of all the pieces that would need to be torn down. Maybe @sbc100 or @kripken could provide more info or know if anyone else has tried this?

It seems if you're going to try and release/reset everything it would be easier and probably just as fast to destroy the worker and create a new one. I assume there's some requirement for why that's not an option?

neatudarius commented 1 year ago

@brendandahl , thanks for replying!

it would be easier and probably just as fast to destroy the worker and create a new one

Ideally, my application should spawn workers, and yes, there should be one just for loading the WASM file and exposing low-level APIs (e.g. Skia / GPU integrations). In that case, killing the worker actually fixes all problems. Unfortunately, in some specific environments, I must use a single worker for multiple purposes, including the WASM loading. Killing this worker is not possible in these cases, so I would like to: 1) be able to get back to a clean state: unload the WASM and free all related Emscripten resources (and reload everything when the user triggers an action); 2) also use a tool (like the Emscripten Sanitizer) to write integration-like tests for memory leaks. From my previous experience, it's very important to find and fix leaks, but it's pointless if you cannot automate the tracking / regression testing.

kripken commented 1 year ago

If you can't destroy the worker, building with MODULARIZE gives you a function that creates an instance of the entire module. When nothing refers to such an instance the browser will GC it. But you do need to be careful about having any links to it that you forget about (the devtools memory tab can help track those down).
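
A minimal sketch of that pattern, under stated assumptions: `createModule` stands in for the factory a MODULARIZE build exports (the real name depends on the build), `ModuleSlot` is a hypothetical wrapper, and the fake instance only imitates the shape of a real one. The `WeakRef` returned by `unload()` gives a test something to probe, although when (or whether) the engine collects the old instance is entirely up to the VM.

```javascript
// Stand-in for the factory function a MODULARIZE build exports (hypothetical name).
async function createModule() {
  // Fake instance; a real one would carry the wasm exports and memory views.
  return { HEAPU8: new Uint8Array(1024), _frame() {} };
}

// Hypothetical wrapper that owns the single strong reference to the instance.
class ModuleSlot {
  constructor(factory) {
    this.factory = factory;
    this.instance = null;
  }

  async load() {
    this.instance = await this.factory();
    return this.instance;
  }

  // Drop our reference and return a WeakRef so a test can check collection later.
  unload() {
    if (this.instance === null) {
      return null;
    }
    const probe = new WeakRef(this.instance);
    this.instance = null;
    return probe;
  }
}

async function reloadOnce() {
  const slot = new ModuleSlot(createModule);
  await slot.load();
  const probe = slot.unload(); // old instance is now only weakly reachable from here
  await slot.load();           // fresh instance, nothing carried over
  return { slot, probe };
}

const reloadResult = reloadOnce();
```

Once the old instance has been collected, `probe.deref()` returns `undefined`; if it keeps returning the object across forced GCs in a test environment, something else still holds a reference, which is where the devtools memory tab comes in.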

andyb1979 commented 1 year ago

Hi everyone, I'm facing similar issues and need to delete all wasm memory once I've finished with a wasm module. Due to how third-party code & libraries interact with our wasm code it's not always possible or easy to cleanup & ensure zero references to wasm Instance.

Ideally I'd like to not have to recompile the module but just create new instances in future, so something like a

Module.delete();
// or better
wasmInstance.delete();

would be great. I'm compiling with MODULARIZE. Does something like this exist in emsdk?

Related question & comment here

sbc100 commented 1 year ago

it's not always possible or easy to cleanup & ensure zero references to wasm Instance.

As long as references to the module instance exist, it's not possible for the engine to free the associated code and wasm memory. There would be no way for Emscripten to provide a useful .delete() method, since the GC is fully under the control of the VM.

(Having said that, I'm not sure how the VM treats things like workers/pthreads; it might be worth at least calling Module['exit'] or Module['abort'], which will bring down all the pthread workers).

andyb1979 commented 1 year ago

Thanks Sam, appreciate it. We're working on cleaning up every reference to the module instance (which is passed around to external code) by proxying the instance object. I think we're struggling with some canvas references being held as well. Thanks for the info about the limitations and for your comments on #19470 as well.

We're not using workers/threads, but it seems worthwhile to exit the module anyway. I was unable to find Module['exit'] or 'abort' in the generated JS glue code. Is this function on the Module or on the instance? Thanks!
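
One possible shape for that proxying approach, as a sketch only: `shareInstance` is a hypothetical helper, and `fakeInstance` stands in for the real module instance. It relies on the standard `Proxy.revocable`, whose `revoke()` detaches the proxy from its target, so external code that still holds the proxy no longer keeps the instance reachable through it.

```javascript
// Hypothetical helper: external code receives `proxy`, we keep `revoke` for shutdown.
function shareInstance(instance) {
  // An empty handler forwards every operation to the target unchanged.
  const { proxy, revoke } = Proxy.revocable(instance, {});
  return { proxy, revoke };
}

const fakeInstance = { version: 1 }; // stand-in for the real module instance
const shared = shareInstance(fakeInstance);
const seenByThirdParty = shared.proxy.version; // forwarded read while the proxy is live

shared.revoke(); // at shutdown: the proxy no longer references the instance

let accessAfterRevoke = 'ok';
try {
  shared.proxy.version; // any operation on a revoked proxy throws
} catch (e) {
  accessAfterRevoke = e instanceof TypeError ? 'TypeError' : 'other';
}
```

The limitation of this sketch is that values read through the proxy before revocation (exported functions, typed-array views of the heap, canvas objects) are handed out unwrapped, so any of those that external code stored will still keep memory alive.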

sbc100 commented 1 year ago

You would want to add _exit and/or _abort to your EXPORTED_FUNCTIONS list.
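
A hedged sketch of how calling code might use that export: `shutdownInstance` is a hypothetical helper, and it assumes the build added `_exit` to EXPORTED_FUNCTIONS so that it shows up as a function on the instance. Whether the call throws in order to unwind the stack is treated as an assumption here, so the sketch catches defensively rather than relying on either behavior.

```javascript
// Hypothetical helper: call the exported exit function if this build provides it.
function shutdownInstance(instance, status = 0) {
  if (typeof instance._exit !== 'function') {
    return 'not-exported'; // the build did not list _exit in EXPORTED_FUNCTIONS
  }
  try {
    instance._exit(status);
  } catch (e) {
    // Assumption: the runtime may throw to unwind on exit; treat that as an expected outcome.
    return 'exited-with-throw';
  }
  return 'exited';
}

// Fake instances, only to exercise the three paths of the sketch.
const withoutExport = {};
const quietExit = { _exit() {} };
const throwingExit = { _exit(code) { throw new Error('exit(' + code + ')'); } };

const results = [
  shutdownInstance(withoutExport),
  shutdownInstance(quietExit),
  shutdownInstance(throwingExit, 1),
];
```

Exiting the runtime does not free the instance by itself; the references to it still have to be dropped for the engine to reclaim the code and memory.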