Dynamic loading modules which come with JS glue code

shanumante-sc commented 2 years ago

We want to separate out our networking library (libNetwork) into a separate dlopen-able module.

libNetwork provides the following API:

// INetworkProvider.hpp

struct Response { int code; const char* data; int length; ...};
using ResponseCallback = void (*)(void*, Response*);
...
class INetworkProvider {
   virtual void make_request(const char* url, ResponseCallback callback);
   ...
};

INetworkProvider* create_network_provider();

void destroy_network_provider(INetworkProvider*);

Other libraries (like libMyApplication) use libNetwork as follows:

Use INetworkProvider.hpp from the libNetwork repository during compilation
Run the code below to get a concrete NetworkProvider (omitting error-checking code):

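std::shared_ptr<INetworkProvider> createNetworkProvider() {
    // dlsym returns void*, so each symbol is cast to its function-pointer type.
    void* library = dlopen("libNetwork.so", RTLD_NOW);
    createFn_ = reinterpret_cast<INetworkProvider* (*)()>(dlsym(library, "create_network_provider"));
    destroyFn_ = reinterpret_cast<void (*)(INetworkProvider*)>(dlsym(library, "destroy_network_provider"));
    // destroyFn_ acts as the shared_ptr deleter.
    return std::shared_ptr<INetworkProvider>{(*createFn_)(), destroyFn_};
}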

The above code sits inside libMyApplication. Once libMyApplication has a pointer to INetworkProvider, it can make network requests.
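For reference, the loading step above could look roughly like this once the omitted error checking is put back. This is only a sketch: the function name loadNetworkProvider and the choice to return an empty pointer on failure are assumptions made for illustration, not part of libNetwork, and INetworkProvider is forward-declared here instead of including the real header.

```cpp
#include <dlfcn.h>
#include <cstdio>
#include <memory>

class INetworkProvider;  // Opaque here; the real definition lives in INetworkProvider.hpp.

using CreateFn = INetworkProvider* (*)();
using DestroyFn = void (*)(INetworkProvider*);

// Hypothetical helper: returns an empty pointer if the library, either
// symbol, or the provider itself cannot be obtained.
std::shared_ptr<INetworkProvider> loadNetworkProvider(const char* path) {
    void* library = dlopen(path, RTLD_NOW);
    if (library == nullptr) {
        // dlerror() describes the most recent dlopen failure.
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return nullptr;
    }

    auto createFn = reinterpret_cast<CreateFn>(dlsym(library, "create_network_provider"));
    auto destroyFn = reinterpret_cast<DestroyFn>(dlsym(library, "destroy_network_provider"));
    if (createFn == nullptr || destroyFn == nullptr) {
        std::fprintf(stderr, "required symbol missing in %s\n", path);
        dlclose(library);
        return nullptr;
    }

    INetworkProvider* provider = createFn();
    if (provider == nullptr) {
        dlclose(library);
        return nullptr;
    }
    // The library's own destroy function is the deleter, so the object is
    // released by the same module that created it.
    return std::shared_ptr<INetworkProvider>(provider, destroyFn);
}
```

The library handle is intentionally kept open for as long as a provider may exist, since the deleter lives inside the loaded module.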

Constraints

The API provided by libNetwork is pure-C except for use of virtual functions (which is assumed to be stable across emscripten versions for now)
libNetwork and libMyApplication might use different versions of the emscripten toolchain for compilation
libNetwork has some JS glue code when compiled to wasm which is used to proxy the actual networking calls.
libMyApplication needs libNetwork only in some special cases. Hence, we want to download and use libNetwork at runtime only when the application needs to do a network call.

Looking at official emscripten documentation (https://emscripten.org/docs/compiling/Dynamic-Linking.html) it looks like we can only enable dynamic linking with a side module. However, this doesn’t fit well with our use case above for the following reasons:

A side module must be pure-wasm without JS glue code
In our case, we prefer that the application module and network module each have their own system libraries without sharing (since they can be using different emscripten versions). Looking at documentation, the side module wouldn’t have system libraries included.

Possible solutions

Ideal

In an ideal world, we can compile both libNetwork and libMyApplication as main modules, and libMyApplication can still dlopen libNetwork. However, based on the documentation, that has some unhandled corner cases.

Acceptable

Allow side modules to have JS glue code and system library code (for the system libraries being used). In that case, we can build libNetwork as a side module with JS glue code, and then load that dynamically at runtime from libMyApplication. Some mechanism needs to be provided to load the associated JS glue code.

kripken commented 2 years ago

This could be interesting to experiment with, basically to copy to the Module object things on a JS "side module". I am a little unsure how easy it would be, though, as parts of the JS library do things like add global-level code (search for __postset) and assume things about scoping.

sbc100 commented 2 years ago

It sounds to me like what you are asking for is something that is not possible with any dynamic linking system out there today, i.e. two different copies of libc in a single application sharing a single memory. The easiest way to see why this is tricky is to consider that this means two different copies of malloc and free. In emscripten, as with all dynamic linking systems I know of, there is always just one copy of libc (and indeed just one copy of any given library).

I think it might be a better approach to push for more compatibility between side modules and main modules built with different emscripten versions. Have you run into real world issues with this kind of compatibility over time? Perhaps we could maintain some kind of ABI version scheme that could allow us to break compatibility at certain known intervals (thus triggering a rebuild of the world). Would this approach work for you?
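One way such an ABI version scheme could be checked at load time is sketched below. Everything here is hypothetical: the AbiVersion struct, its epoch and revision fields, and the compatibility rule are illustrative assumptions, and emscripten does not provide such an API. The assumed rule is that an epoch bump marks a documented break (forcing a rebuild), while revisions within an epoch are additive.

```cpp
#include <cstdint>

// Hypothetical ABI descriptor that a main module and a side module
// could each record at build time.
struct AbiVersion {
    std::uint32_t epoch;     // Bumped on a documented, breaking ABI change.
    std::uint32_t revision;  // Bumped on backwards-compatible additions.
};

// A module is loadable if it was built for the same epoch and does not
// rely on additions newer than what the host provides.
constexpr bool isAbiCompatible(AbiVersion host, AbiVersion module) {
    return host.epoch == module.epoch && module.revision <= host.revision;
}

// Compile-time sanity checks of the rule above.
static_assert(isAbiCompatible(AbiVersion{3, 2}, AbiVersion{3, 1}),
              "older revision within the same epoch should load");
static_assert(!isAbiCompatible(AbiVersion{3, 2}, AbiVersion{4, 0}),
              "a different epoch should be rejected");
```

Under this rule, a loader would refuse a side module from a newer epoch instead of failing in some less obvious way later.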

sbc100 commented 2 years ago

If you really want to use two different main modules, then I think dlopen() is not the way to go. What you want would be more like two completely separate applications/modules communicating via RPC through some custom JS glue that you write.

shanumante-sc commented 2 years ago

Thanks for the feedback!

It sounds to me like what you are asking for is something that is not possible with any dynamic linking system out there today, i.e. two different copies of libc in a single application sharing a single memory

Maybe I am misunderstanding your concern - But if we statically link libc runtime into a shared library, strip unneeded global symbols, and then dynamically load this into an executable which itself has libc runtime, there shouldn't be any problem in loading/running the library. We use this in android where multiple shared libraries with their own (and potentially different) versions of libc/libc++ statically linked are loaded into the same application.

Have you run into real world issues with this kind of compatibility over time?

We haven't tried it yet because it would be undefined behavior. By "system libraries" I assume you are referring to libc/libc++ - since libMyApplication and libNetwork use different STL versions/emscripten toolchain, it would be dangerous to assume any call into libc++ from libNetwork can use the definition in libFoo and work as expected.

If you really want to use two different main modules, then I think dlopen() is not the way to go. What you want would be more like two completely separate applications/modules communicating via RPC through some custom JS glue that you write.

Yes, that is one approach we are considering if dlopen isn't feasible - but it does add overhead in terms of performance because we need to cross over into JS each time to communicate between the modules.

This could be interesting to experiment with, basically to copy to the Module object things on a JS "side module"

Is there an example for this, or some pointers around how we might go about doing this?

sbc100 commented 2 years ago

We haven't tried it yet because it would be undefined behavior. By "system libraries" I assume you are referring to libc/libc++ - since libMyApplication and libNetwork use different STL versions/emscripten toolchain, it would be dangerous to assume any call into libc++ from libNetwork can use the definition in libFoo and work as expected.

Why do you think that would be undefined behaviour? In theory, the ABI of libc and libc++ remains stable over time, doesn't it? For example, I can build a binary on my old linux system against glibc N and then run it on my new system with glibc N+1, and in most cases this is expected to work (there are some cases where the ABI breaks, but in general it's stable).

Most applications do not statically link libc or libc++, right? They just use the one that the OS provides, which might not be the precise one they were built against.

Having said all that, emscripten has not attempted this kind of ABI stability yet, so it would require some effort on our part.

sbc100 commented 2 years ago

Maybe I am misunderstanding your concern - But if we statically link libc runtime into a shared library, strip unneeded global symbols, and then dynamically load this into an executable which itself has libc runtime, there shouldn't be any problem in loading/running the library. We use this in android where multiple shared libraries with their own (and potentially different) versions of libc/libc++ statically linked are loaded into the same application.

I'm very surprised that this works at all. The only way I can imagine this working is if you have very strict rules about passing data between the two parts of the application, e.g. you could not pass any C library structures or C++ objects, and you could not pass objects that you expect the other side to free (since you have two different allocators in the same application). These strict restrictions, I imagine, would make widespread use of this technique difficult.

On the subject of the allocator in libc, this is even less likely to be workable on emscripten, where our allocators are all based on sbrk() rather than mmap(). We have a single heap space that grows with sbrk(), and only a single allocator can be in charge of it. In theory the side module could use a custom libc with an mmap-based allocator, but then the mmap implementation would need to come from the main module, so we could have part of libc shared and part embedded.

shanumante-sc commented 2 years ago

The only way I can imagine this working is if you have very strict rules about passing data between the two parts of the application.

Yep, we only allow passing user-defined PODs across the API boundary so that the struct layout is the same even when using different libc versions, and memory allocated inside one library must also be freed by the same library (i.e. we can't allocate something in one .so and transfer ownership to another .so). And yes, it does make the API a bit tedious, but the API surface is small enough that it hasn't been an issue in practice so far.
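A minimal sketch of the two rules described above (POD-only types at the boundary, and memory freed by the library that allocated it) might look like the following. The names Payload, payload_create and payload_destroy are hypothetical and are not part of libNetwork.

```cpp
#include <cstdlib>
#include <cstring>
#include <type_traits>

extern "C" {

// Only plain-old-data crosses the boundary, so the layout does not
// depend on any libc++ implementation detail.
struct Payload {
    int code;
    char* data;
    int length;
};

// Allocates with the library's own allocator.
Payload* payload_create(int code, const char* text) {
    auto* payload = static_cast<Payload*>(std::malloc(sizeof(Payload)));
    if (payload == nullptr) return nullptr;
    payload->code = code;
    payload->length = static_cast<int>(std::strlen(text));
    payload->data = static_cast<char*>(std::malloc(payload->length + 1));
    if (payload->data == nullptr) {
        std::free(payload);
        return nullptr;
    }
    std::memcpy(payload->data, text, payload->length + 1);  // Copies the terminating NUL too.
    return payload;
}

// Frees with the same allocator that payload_create used. Callers must
// hand the object back here and never call their own free() on it.
void payload_destroy(Payload* payload) {
    if (payload == nullptr) return;
    std::free(payload->data);
    std::free(payload);
}

}  // extern "C"

// Guard against someone adding a non-POD member later.
static_assert(std::is_standard_layout<Payload>::value &&
                  std::is_trivially_copyable<Payload>::value,
              "Payload must stay plain-old-data");
```

On the caller side, wrapping the pointer in a std::unique_ptr with payload_destroy as the deleter is one way to make the ownership rule hard to violate.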

the ABI of libc and libc++ remains stable over time, doesn't it

Per my understanding, ABI is stable for libc, but I'm not sure about libc++. For example, if we were using std::vector in libA.so and libB.so, and they are built using different libc++ versions, there's no guarantee that I can pass a vector created by libA into libB, even by reference (because the vector layout could have changed between those versions)
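To illustrate the concern, one common way to avoid depending on the std::vector layout is to pass a raw pointer and an element count across the boundary, and keep the vector itself on the caller's side. The sketch below uses hypothetical names (sum_values, sumVector) and is not taken from any of the libraries discussed.

```cpp
#include <cstddef>
#include <vector>

// Library side (think libB): takes a pointer and a count, so it never
// needs to know how the caller's std::vector is laid out.
extern "C" long sum_values(const int* values, std::size_t count) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// Caller side (think libA): the vector stays entirely within the module
// that created it; only its contiguous storage is exposed.
long sumVector(const std::vector<int>& values) {
    return sum_values(values.data(), values.size());
}
```

The element type must itself be layout-stable (a built-in type or a POD struct) for this to hold.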

On the subject of the allocator in libc, this is even less likely to be workable on emscripten, where our allocators are all based on sbrk() rather than mmap(). We have a single heap space that grows with sbrk(), and only a single allocator

I see, this seems like a difficult one to work around.

So overall, it seems like building our own proxy layers in JS is the only viable option in the case where we can't unify the emscripten/libc/libc++ versions across our libraries.

So then assuming we are able to unify these versions, is there a solution for allowing a side module to also have its own JS glue code which is loaded when the module is loaded?

sbc100 commented 2 years ago

For the libc/libc++ issue I think there are two options:

Ensure all modules that get linked are built with the same emscripten version.
Rely on the guarantees from emscripten that libc/libc++ ABI will remain largely stable (with explicit documented breakages only) over time.

(1) would be the safest option, but I think we can and should be moving in the direction of (2) in any case, so I think it's a reasonable request. And indeed, without this kind of ABI stability, side modules will never be shareable or publishable in the general case.

For the JS issue I think we can consider it separately. Are you talking about EM_JS/ASMJS block here? I think that we may be able to make them work in some circumstances. Specifically, if the side modules can be passed to the linker when main module is linked? Is this feasible for your use case? Essentially the JS code would be extracted and included in the main module JS, which would mean that side modules could not be updated without re-linking the main module. Would this limitation work in your case?

sbc100 commented 2 years ago

Per my understanding, ABI is stable for libc, but I'm not sure about libc++. For example, if we were using std::vector in libA.so and libB.so, and they are built using different libc++ versions, there's no guarantee that I can pass a vector created by libA into libB, even by reference (because the vector layout could have changed between those versions)

My understanding is that libstdc++ and libc++ both provide ABI stability guarantees that mean that this is a reasonable thing to do. Without this it wouldn't be possible for C++-based applications/libraries such as Qt to be dynamically linked against libc++/libstdc++ and share a single version of the library at runtime (which they do).

See https://libcxx.llvm.org//DesignDocs/ABIVersioning.html

shanumante-sc commented 2 years ago

Rely on the guarantees from emscripten that libc/libc++ ABI will remain largely stable (with explicit documented breakages only) over time.

Yes, that could potentially work, and it is something we can experiment with based on the link you shared about libc++ ABI stability.

For the JS issue I think we can consider it separately. Are you talking about EM_JS/ASMJS block here?

We have the following in our side modules:

EM_JS
embind (And also support EMSCRIPTEN_BINDINGS-based initialization)

And maybe in the future

Other emscripten-provided wrappers like libworkerfs, opengles->webgl interop etc

Specifically, if the side modules can be passed to the linker when main module is linked? Is this feasible for your use case?

By "if the side modules can be passed to the linker", do you mean providing generated glue code for the side module (once that is supported) to the main module, or something else? If it is the former, then it should be possible.

Essentially the JS code would be extracted and included in the main module JS, which would mean that side modules could not be updated without re-linking the main module. Would this limitation work in your case?

Yeah, that sounds like a reasonable approach. It does bloat the main module JS slightly, but we can definitely live with it for now.

emscripten-core / emscripten