emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

Opening a side module larger than 4KB fails on Chrome #11753

Open cuinjune opened 4 years ago

cuinjune commented 4 years ago

Hi, I'm trying to dynamically load a side module from the main module using dlopen(). The side module loads fine as long as its size is smaller than 4KB, but I need to load larger side modules. Here's a simple example you can use to test this:

side.c:

#define SIZE 4000
char dummy[SIZE] = {};

int side(int a)
{
    return SIZE;
}

main.c:

#include <stdio.h>
#include <dlfcn.h>

int main() 
{
    void *handle;
    typedef int (*func_t)(int);

    handle = dlopen("side.wasm", RTLD_NOW);

    if (!handle) {
        printf("failed to open the library\n");
        return 0;
    }
    func_t func = (func_t)dlsym(handle, "side");

    if (!func) {
        printf("failed to find the method\n");
        dlclose(handle);
        return 0;
    }
    printf("side module size: %d byte\n", func(1));
}

index.html:

<!DOCTYPE html>
<html>

<head>
</head>

<body>
  <script async src="main.js"></script>
</body>

</html>

commands:

emcc side.c -s SIDE_MODULE=1 -o side.wasm
emcc main.c -s MAIN_MODULE=1 -o main.html --preload-file side.wasm
python3 -m http.server 8080

And this is the result I get in the Chrome browser:

Error in loading dynamic library side.wasm: RangeError: WebAssembly.Compile is disallowed on the main thread, if the buffer size is larger than 4KB. Use WebAssembly.compile, or compile on a worker thread.

Can someone please guide me on how to dynamically load a side module larger than 4KB? Thank you in advance!

gerald-dotcom commented 4 years ago

It's a Chrome limitation. You can use loadDynamicLibrary(url, {loadAsync: true, global: true, nodelete: true}), but there is one thing that I'm missing to make it work. Maybe someone can help you out.

kripken commented 4 years ago

It looks like the preloading logic looks for the normal shared library suffix on POSIX, .so:

https://github.com/emscripten-core/emscripten/blob/df56ba11fe8dbae3949446239affe69b2efa7772/src/library_browser.js#L241

We should document that if it isn't already.

cuinjune commented 4 years ago

@sbc100 Could you please help me? I would really appreciate it!

sbc100 commented 4 years ago

Can you try using the .so extension as @kripken suggested? It looks like there is some special handling to preload side modules but it looks for the .so extension.

BTW in this mode you are going to delay the startup of your program until both the side module and main module are instantiated, so you are not really getting any benefit at runtime from using a side module. Would it make more sense to simply compile the side module into the main program?

cuinjune commented 4 years ago

@sbc100 How do I build a side module with the .so extension? Can I just replace wasm with so like the following?

emcc side.c -s SIDE_MODULE=1 -o side.so

I need the side module to exist separately from the main module, and the side module will be instantiated in the middle of the program not necessarily at the startup.

sbc100 commented 4 years ago

Yes, -o side.so should work. Better still, use -o libside.so to match the UNIX convention.

If you preload the file with --preload-file then the side module will actually be instantiated at startup, even though logically you don't access it until later. This should not be observable to your program, other than that startup will be delayed until the side module has been downloaded and compiled. Does that work for you?

cuinjune commented 4 years ago

Yes, it's okay if there's a delay. I'm using --preload-file for testing purposes now, but the side modules should be dynamically added to the file system and loaded in the middle of the program. That's why I'm using side modules. I already tested with side modules smaller than 4KB and they work fine.

I used these commands to build:

emcc side.c -s SIDE_MODULE=1 -c -o side.so
emcc main.c -s MAIN_MODULE=1 -o main.html --preload-file side.so
python3 -m http.server 8080

Now I got these errors when I run it in Chrome:

Assertion failed: need the dylink section to be first

Error in loading dynamic library side.so: RuntimeError: abort(Assertion failed: need the dylink section to be first) at Error at jsStackTrace (main.js:2568) at stackTrace (main.js:2586) at abort (main.js:2288) at assert (main.js:1329) at loadWebAssemblyModule (main.js:756) at createLibModule (main.js:650) at getLibModule (main.js:669) at loadDynamicLibrary (main.js:708) at _dlopen (main.js:7686) at __original_main (:8080/:wasm-function[153]:0x6c7e7)

cuinjune commented 4 years ago

If loading the large-size side module with dlopen() is not possible, is there any workaround to this? (e.g. loading the .wasm file from index.html?)

sbc100 commented 4 years ago

Did you remember to change the name of the side module in main.c?

It should work with preloading, such that side.so is loaded at startup, and you should not see a runtime call to loadWebAssemblyModule on dlopen because side.so is already included in Module['preloadedWasm']. See https://github.com/emscripten-core/emscripten/blob/8bd9c9379d5697c06b72d549dfc28f47ff6fcb6c/src/library_browser.js#L252

If that doesn't work you can try using the asynchronous JS function loadDynamicLibrary as suggested earlier in this thread. However, the dlopen+preload approach should work if you want it to.

cuinjune commented 4 years ago

@sbc100 Yes I did of course. Here's what everything looks like:

side.c:

#define SIZE 3000
char dummy[SIZE] = {};

int side(int a)
{
    return SIZE;
}

main.c:

#include <stdio.h>
#include <dlfcn.h>

int main()
{
    void *handle;
    typedef int (*func_t)(int);

    handle = dlopen("side.so", RTLD_NOW);

    if (!handle) {
        printf("failed to open the library\n");
        return 0;
    }
    func_t func = (func_t)dlsym(handle, "side");

    if (!func) {
        printf("failed to find the method\n");
        dlclose(handle);
        return 0;
    }
    printf("side module size: %d byte\n", func(1));
}

index.html:

<!DOCTYPE html>
<html>

<head>
</head>

<body>
  <script async src="main.js"></script>
</body>

</html>

commands:

emcc side.c -s SIDE_MODULE=1 -c -o side.so
emcc main.c -s MAIN_MODULE=1 -o main.html --preload-file side.so
python3 -m http.server 8080

Result:

Assertion failed: need the dylink section to be first

Error in loading dynamic library side.so: RuntimeError: abort(Assertion failed: need the dylink section to be first) at Error at jsStackTrace (main.js:2568) at stackTrace (main.js:2586) at abort (main.js:2288) at assert (main.js:1329) at loadWebAssemblyModule (main.js:756) at createLibModule (main.js:650) at getLibModule (main.js:669) at loadDynamicLibrary (main.js:708) at _dlopen (main.js:7686) at __original_main (:8080/:wasm-function[153]:0x6c7e7)

cuinjune commented 4 years ago

I tried to use the loadDynamicLibrary:

<!DOCTYPE html>
<html>

<head>
</head>

<body>
  <script>
    loadDynamicLibrary("side.wasm", {loadAsync: true, global: true, nodelete: true})
  </script>
  <script async src="main.js"></script>
</body>

</html>

But I get this error:

Uncaught ReferenceError: loadDynamicLibrary is not defined

gerald-dotcom commented 4 years ago

Use EM_JS and rename .wasm to .so

cuinjune commented 4 years ago

@gerald-dotcom I still get the same error:

Uncaught (in promise) RuntimeError: abort(Assertion failed: need the dylink section to be first) at Error at jsStackTrace (http://localhost:8080/main.js:2568:17) at stackTrace (http://localhost:8080/main.js:2586:16) at abort (http://localhost:8080/main.js:2288:44) at assert (http://localhost:8080/main.js:1329:5) at loadWebAssemblyModule (http://localhost:8080/main.js:756:3) at createLibModule (http://localhost:8080/main.js:650:12) at http://localhost:8080/main.js:665:16 at abort (http://localhost:8080/main.js:2294:11) at assert (http://localhost:8080/main.js:1329:5) at loadWebAssemblyModule (http://localhost:8080/main.js:756:3) at createLibModule (http://localhost:8080/main.js:650:12) at http://localhost:8080/main.js:665:16

Here's my code:

main.c:

#include <stdio.h>
#include <dlfcn.h>
#include <emscripten.h>

EM_JS(void, loadLibrary, (), {
      loadDynamicLibrary("side.so", {loadAsync: true, global: true, nodelete: true});
});

int main()
{
    loadLibrary();
}

commands:

emcc side.c -s SIDE_MODULE=1 -c -o side.so
emcc main.c -s MAIN_MODULE=1 -o main.html --preload-file side.so
python3 -m http.server 8080

sbc100 commented 4 years ago

Strange... somehow the side module is missing the dylink section which emscripten adds to the beginning of it.

cuinjune commented 4 years ago

I also posted my question on Stackoverflow https://stackoverflow.com/q/63150966/5224286 with a bounty.

sbc100 commented 4 years ago

Ah! The problem is that you are passing -c in emcc side.c -s SIDE_MODULE=1 -c -o libside.so, which means "compile to an object file". This means the resulting side.so is not a shared library but just an object file.

If you remove -c it should work.

Sadly it looks like you also have to change the output name back to .wasm to persuade emcc to produce a side module. We should fix that part.

cuinjune commented 4 years ago

@sbc100 But if I remove -c then it doesn't compile. command: emcc side.c -s SIDE_MODULE=1 -o side.so

I get the following error:

emcc: error: SIDE_MODULE must only be used when compiling to an executable shared library, and not when emitting an object file. That is, you should be emitting a .wasm file (for wasm) or a .js file (for asm.js). Note that when compiling to a typical native suffix for a shared library (.so, .dylib, .dll; which many build systems do) then Emscripten emits an object file, which you should then compile to .wasm or .js with SIDE_MODULE.

sbc100 commented 4 years ago

Sadly it looks like you also have to change the output name back to .wasm to persuade emcc to produce a side module. You will then need to rename the file to .so in order to preload it. We should fix that part.

sbc100 commented 4 years ago

Apologies for that very hard to understand error message... the situation with linking .so and .dylib extensions is complicated due to the history of emscripten and the need to keep compatibility with build systems that want to use those extensions but don't want side modules.

cuinjune commented 4 years ago

So is it not possible to load a side module larger than 4KB at the moment?

sbc100 commented 4 years ago

No, it's possible. You just need to do the instantiation asynchronously because the browser doesn't support the synchronous loading of larger modules. This limitation is built into the browser and not related to emscripten.

Both loadDynamicLibrary (with loadAsync: true) and dlopen (in combination with the emscripten feature that pre-loads .so files) should allow for this asynchronous loading.
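
The synchronous/asynchronous distinction can be seen with the plain WebAssembly JS API, independent of Emscripten. The following is a minimal sketch, not Emscripten's actual loader: the hand-assembled module bytes, the names compileModule and loadSide, and the 4096-byte threshold (taken from the "4KB" in Chrome's error message) are all assumptions for illustration. Node.js does not enforce the limit, so the sketch only shows the shape of the two code paths.

```javascript
// Hand-assembled stand-in for a real side module: exports `side`, which returns 42.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic, version 1
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,                   // type section: () -> i32
  0x03, 0x02, 0x01, 0x00,                                     // function section: one function of type 0
  0x07, 0x08, 0x01, 0x04, 0x73, 0x69, 0x64, 0x65, 0x00, 0x00, // export section: "side" = function 0
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b,             // code section: i32.const 42; end
]);

// Assumed threshold, based on the "4KB" quoted in Chrome's error message.
const SYNC_COMPILE_LIMIT = 4096;

// Always returns a Promise<WebAssembly.Module>. Small buffers may be compiled
// synchronously; larger ones go through the asynchronous WebAssembly.compile.
function compileModule(bytes) {
  if (bytes.byteLength <= SYNC_COMPILE_LIMIT) {
    return Promise.resolve(new WebAssembly.Module(bytes));
  }
  return WebAssembly.compile(bytes);
}

// Compiles and instantiates the module, then calls its `side` export.
async function loadSide(bytes) {
  const compiled = await compileModule(bytes);
  const instance = await WebAssembly.instantiate(compiled, {});
  return instance.exports.side();
}

loadSide(wasmBytes).then((result) => console.log('side() returned', result));
```

Because the caller always receives a promise, it does not need to know which path was taken, which is also why a synchronous API such as dlopen cannot simply switch to the asynchronous path.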

cuinjune commented 4 years ago

Thank you. So how can I generate the .so file? Should I run emcc side.c -s SIDE_MODULE=1 -o side.wasm and then rename side.wasm to side.so?

sbc100 commented 4 years ago

Yes, if you want to use dlopen in combination with the preload plugin system that seems necessary today.

I will work on two changes:

  1. Allow linking side modules directly to .so.
  2. Allow .wasm files to be preloaded by the preload plugin.

But you should be able to get it working even without either of those changes.

cuinjune commented 4 years ago

Thanks, I tried it as you said. Here's my code:

main.cpp

#include <stdio.h>
#include <emscripten.h>
#include <emscripten/bind.h>

using namespace emscripten;

int side(int a);

void hello() {
    printf("side module size: %d byte\n", side(1));
}

EMSCRIPTEN_BINDINGS(my_module) {
    function("hello", &hello);
}

EM_JS(void, loadLibrary, (), {
      loadDynamicLibrary("side.so", {loadAsync: true, global: true, nodelete: true});
});

int main()
{
    loadLibrary();
}

index.html

<!DOCTYPE html>
<html>

<head>
</head>

<body>
  <button id="buttonPressed">call side() function</button>
  <script>
    var Module
      = {
      preRun: []
      , postRun: []
      , print: function (e) {
        1 < arguments.length && (e = Array.prototype.slice.call(arguments).join(" "));
        console.log(e);
      }
      , printErr: function (e) {
        1 < arguments.length && (e = Array.prototype.slice.call(arguments).join(" "));
        console.error(e)
      }
    };

    function buttonPressed() {
      Module.hello();
    }

    window.onload = async () => {
      document.getElementById("buttonPressed").addEventListener("click", buttonPressed, false);
    };
  </script>
  <script async src="main.js"></script>
</body>

</html>

commands:

emcc side.c -s SIDE_MODULE=1 -o side.wasm
mv -f side.wasm side.so
emcc --bind main.cpp -s MAIN_MODULE=1 -o main.html --preload-file side.so
python3 -m http.server 8080

After the page is loaded, if I press the call side() function button, I get the following error:

main.js:2294 Uncaught RuntimeError: abort(external function '_Z4sidei' is missing. perhaps a side module was not linked in? if this function was expected to arrive from a system library, try to build the MAIN_MODULE with EMCC_FORCE_STDLIBS=1 in the environment) at Error

What did I do wrong?

sbc100 commented 4 years ago

That looks like a name mangling issue. If you compile your side module as C rather than C++ then you need to also add extern "C" to the declaration in your C++ code (main.cpp):

extern "C" {
  int side(int a);
}

sbc100 commented 4 years ago

While working on some fixes in this area I noticed that the preload+dlopen approach has some other issues, so it's good that you proceed with the loadDynamicLibrary approach for now until we can address them.

cuinjune commented 4 years ago

It worked! Thank you so much!

May I ask how to implement a callback function to detect if the .so file is successfully loaded?

sbc100 commented 4 years ago

Looks like you can wait on the returned promise:

// - if flags.loadAsync=true, the loading is performed asynchronously and        
//   loadDynamicLibrary returns corresponding promise.     
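
As a sketch of what "waiting on the returned promise" could look like: the code below wraps a promise-returning loader with success and failure callbacks. fakeLoadDynamicLibrary is a hypothetical stand-in so the example runs outside an Emscripten build, and loadWithCallbacks is an illustrative name. In a real build the loader argument would be Emscripten's loadDynamicLibrary, called from inside the runtime (for example from EM_JS).

```javascript
// Hypothetical stand-in for Emscripten's loadDynamicLibrary. It resolves after
// a short delay for .so/.wasm paths and rejects for anything else.
function fakeLoadDynamicLibrary(path, flags) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (path.endsWith('.so') || path.endsWith('.wasm')) resolve(true);
      else reject(new Error('failed to load ' + path));
    }, 10);
  });
}

// Calls onLoaded once the library is ready, or onError if loading failed.
// Returns a promise that settles after the matching callback has run.
function loadWithCallbacks(load, path, onLoaded, onError) {
  return load(path, { loadAsync: true, global: true, nodelete: true })
    .then(() => onLoaded(path), (err) => onError(err));
}

loadWithCallbacks(
  fakeLoadDynamicLibrary, 'side.so',
  (path) => console.log(path + ' loaded'),
  (err) => console.error(err.message));
```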

cuinjune commented 4 years ago

Thank you so much! Is it not possible to dynamically call a side module's function (side(1);) without declaring it (int side(int a);) in the main module? This is the reason why I was using dlopen() so I can dynamically call a function with string.

sbc100 commented 4 years ago

Try using dlsym with RTLD_DEFAULT as the first argument. This will search the global namespace of all symbols, which should contain your side symbol after the code has been loaded.

cuinjune commented 4 years ago

Thank you very much! I will try that.

Can I ask you one more question? My app should work synchronously, so it should pause all other processes until the side.so is loaded then the side() function is called. Would it work synchronously if I use async/await?

Here's my example code that works:

#include <stdio.h>
#include <emscripten.h>
#include <emscripten/bind.h>

using namespace emscripten;

extern "C" {
  int side(int a);
}

void hello() {
    printf("side module size: %d byte\n", side(1));
}

EMSCRIPTEN_BINDINGS(my_module) {
    function("hello", &hello);
}

EM_JS(void, loadLibrary, (), {
      async function doLoadLibrary() {
        try {
          await loadDynamicLibrary("side.so", {loadAsync: true, global: true, nodelete: true});
          Module.hello();
        }
        catch(error) {
          console.log(error);
        }
      }
      doLoadLibrary();
});

int main()
{
    loadLibrary();
    // do something after side(1) is called. (Can I expect this to always happen after the side(1) is called?)
}

ADDED: I just checked that if I print something after loadLibrary() then it gets printed to the console first before side(1) is called so I guess my approach doesn't work.

What I want is, for example if I load a.so and then b.so, it should be guaranteed that a() is called before b().

gerald-dotcom commented 4 years ago

EM_JS creates a function behind the scenes. To use async/await syntax you have to use Asyncify, but the last time I checked I couldn't compile (https://github.com/emscripten-core/emscripten/issues/11717), so I went with callbacks that use Asyncify with handleSleep instead.
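
On the JavaScript side, the ordering requirement (a() must run before b()) can be met by awaiting each load before starting the next. This is a sketch under stated assumptions: fakeLoad is a hypothetical loader that simulates different load times, and loadInOrder is an illustrative name. It only orders the JavaScript callbacks. Making C code in main() pause until loading finishes still needs Asyncify, as discussed here.

```javascript
// Hypothetical loader: pretends each library takes a different time to load.
function fakeLoad(name, delayMs) {
  return new Promise((resolve) => setTimeout(() => resolve(name), delayMs));
}

// Loads libraries one at a time and invokes onReady for each as soon as it is
// loaded, so the call order always matches the order of `libs`.
async function loadInOrder(libs, load, onReady) {
  for (const lib of libs) {
    await load(lib.name, lib.delayMs);
    onReady(lib.name);
  }
}

const order = [];
// a.so is slower than b.so, yet it is still handled first.
const done = loadInOrder(
  [{ name: 'a.so', delayMs: 30 }, { name: 'b.so', delayMs: 5 }],
  fakeLoad,
  (name) => order.push(name));
done.then(() => console.log(order.join(' then ')));
```

Starting both loads at once and calling each entry point on completion would not give this guarantee, because the faster library would win.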

cuinjune commented 4 years ago

@gerald-dotcom Thank you so much! Using Asyncify worked for me! Here's my working code:

main.c

#include <stdio.h>
#include <emscripten.h>

int side(int a);

EM_JS(void, loadLibrary, (), {
      Asyncify.handleAsync(async () => {
        try {
          await loadDynamicLibrary("side.so", {loadAsync: true, global: true, nodelete: true});
        }
        catch(error) {
          console.log(error);
        }
      });
});

int main()
{
    loadLibrary();
    printf("side module size: %d byte\n", side(1));
}

commands:

emcc main.c -s MAIN_MODULE=1 -s ASYNCIFY -s 'ASYNCIFY_IMPORTS=["loadLibrary"]' -o main.html --preload-file side.so

Now I need to figure out how to dynamically call the side() function using dlsym with RTLD_DEFAULT as @sbc100 suggested.

cuinjune commented 4 years ago

@sbc100 I tried using dlsym with RTLD_DEFAULT but it doesn't work.

main.c:

#include <stdio.h>
#include <dlfcn.h>
#include <emscripten.h>

EM_JS(void, loadLibrary, (), {
      Asyncify.handleAsync(async () => {
        try {
          await loadDynamicLibrary("side.so", {loadAsync: true, global: true, nodelete: true});
        }
        catch(error) {
          console.log(error);
        }
      });
});

int main()
{
    printf("before\n");
    loadLibrary();
    printf("after\n");
    typedef int (*func_t)(int);
    func_t func = (func_t)dlsym(RTLD_DEFAULT, "side");
    if (!func) {
        printf("failed to find the method\n");
        return 0;
    }
    printf("side module size: %d byte\n", func(1));
}

result:

before
after
failed to find the method (Tried to dlsym() from an unopened handle: 0)

I think I did everything correctly. Any idea why it doesn't work?

sbc100 commented 4 years ago

Ah, I guess emscripten does not support RTLD_DEFAULT.

You could fix this by adding support for dlsym, or if it works for you you could call the function via EM_JS since Module['side'] should contain this function after you load the code.
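
A sketch of calling a loaded function by its string name from JavaScript follows. It assumes, as the working code later in this thread does, that the symbol appears on the Module object with a leading underscore. callByName and fakeModule are hypothetical names for illustration, and fakeModule stands in for the real Emscripten Module object after the side module has loaded.

```javascript
// Looks up an exported C function by its C name on a Module-like object and
// calls it. The leading underscore is assumed to match Emscripten's naming.
function callByName(moduleObj, cName, ...args) {
  const fn = moduleObj['_' + cName];
  if (typeof fn !== 'function') {
    throw new Error('symbol not found: ' + cName);
  }
  return fn(...args);
}

// Hypothetical stand-in for Module after the side module has been loaded.
const fakeModule = { _side: () => 5000 };

console.log('side module size: ' + callByName(fakeModule, 'side'));
```

Checking the type before the call turns a missing symbol into a readable error instead of "undefined is not a function".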

cuinjune commented 4 years ago

@sbc100 Thank you so much! I could finally load a side module larger than 4KB and call its function synchronously. And I could use .wasm instead of .so.

Here's my full working code:

side.c:

#define SIZE 5000
char dummy[SIZE] = {};

int side()
{
    return SIZE;
}

main.c:

#include <stdio.h>
#include <dlfcn.h>
#include <emscripten.h>

EM_JS(void, loadLibrary, (const char *name), {
      Asyncify.handleAsync(async () => {
        try {
          var str = UTF8ToString(name);
          await loadDynamicLibrary(str + '.wasm', {loadAsync: true, global: true, nodelete: true});
          console.log('side module size: ' + Module['_' + str]());
        }
        catch(error) {
          console.log(error);
        }
      });
});

int main()
{
    printf("before\n");
    loadLibrary("side");
    printf("after\n");
}

index.html:

<!DOCTYPE html>
<html>

<head>
</head>

<body> 
  <script async src="main.js"></script>
</body>

</html>

commands:

emcc side.c -s SIDE_MODULE=1 -o side.wasm
emcc main.c -s MAIN_MODULE=1 -s ASYNCIFY -s 'ASYNCIFY_IMPORTS=["loadLibrary"]' -o main.html --preload-file side.wasm
python3 -m http.server 8080

result:

before
side module size: 5000
after

cuinjune commented 4 years ago

@sbc100 I just found out loadDynamicLibrary() only works if the side module is preloaded. If I dynamically upload a side module using Module["FS_createDataFile"]() and try to load it, I get the following error:

failed to load binary file at '/side.wasm'

However, loading the dynamically-uploaded side module works fine if I use dlopen() instead of loadDynamicLibrary() as long as the file size is smaller than 4KB.

Can loadDynamicLibrary() only load preloaded side modules? If so, would there be a workaround to load a dynamically-uploaded side module larger than 4KB?

cuinjune commented 4 years ago

@sbc100 Can I expect the above problem to be fixed at some point?

sbc100 commented 4 years ago

We would be happy to assist you if you want to propose a change to loadDynamicLibrary, but it could be that the quickest way to fix this is to dig into the problem yourself.

It could be that restructuring your program to not depend on dynamic loading is the fastest way to get things working. Can you describe your use case a little? Does it really depend on loading symbols by string name at runtime? Or can we find a way to work around this requirement?

cuinjune commented 4 years ago

We are working on making a visual programming software called Purr Data run in a browser. Natively, the software can dynamically load external libraries built by users using dlopen().

Our web version of the software will have a file manager where users can dynamically upload their projects with their external libraries just like how it works natively.

Currently, I could confirm that it is possible to dynamically load .wasm side modules using dlopen() but it doesn't work in Chrome if the file size is larger than 4KB because dlopen() cannot load files asynchronously.

I could load a large-sized file using loadDynamicLibrary() but it only works if the file is preloaded.

I would like to try to fix this in emscripten and propose a change if I can but I would appreciate it if you can guide me on how I should approach this.

  1. It looks like dlopen() internally calls loadDynamicLibrary(). What if I add the loadAsync: true flag when calling the function? Would it make dlopen() load large files dynamically?

  2. What makes it possible for dlopen() to load files dynamically whereas loadDynamicLibrary() can't? Would it be because of the loadAsync: true flag which uses fetchBinary() instead of readBinary()? (former uses fetch() and latter uses nodeFS['readFileSync']())

  3. What if I change nodeFS['readFileSync']() to nodeFS['readFileASync']() in readBinary()?

sbc100 commented 4 years ago

I see. Your use case does indeed look like one that requires genuine runtime dynamic linking. You are effectively loading plugins into your program at runtime.

  1. dlopen is inherently synchronous and WebAssembly module instantiation is inherently asynchronous (at least for modules over 4k). Those two facts are not changeable. However, emscripten does have a feature called asyncify which can transform seemingly synchronous native calls into asynchronous ones: https://emscripten.org/docs/porting/asyncify.html. So it should be possible in theory for dlopen to use loadAsync: true when running in asyncify mode.

  2. I have no clues here, I'm afraid. This is certainly something that seems fixable.

  3. I'm not familiar with that code, I'm afraid.

One general piece of advice is perhaps to look at your code in two parts: (1) the native code, and (2) the web/JS part that runs the native code. On the web it might make more sense to do all your plugin loading on the JS side, where you can make async calls and use loadDynamicLibrary. It might then also be possible to avoid the filesystem completely and just load code from URLs. The filesystem is there for native emulation of stuff like fopen/fread etc. But if you just want to download and load code in JS then perhaps you don't need to bother writing your code into the filesystem at all.

cuinjune commented 4 years ago

@sbc100 Thank you so much for your reply. It was super helpful!

I finally got it working! I also needed to pass the fs: FS flag into the loadDynamicLibrary() function. Here's the full working example code in case someone finds it useful:

main.c:

#include <stdio.h>
#include <emscripten.h>

EM_JS(void, doLoadLibrary, (), {
    Asyncify.handleAsync(async() => {
        try {
            await loadDynamicLibrary('side.wasm', { loadAsync: true, global: true, nodelete: true, fs: FS });
            console.log('side module size: ' + Module['_side']());
        }
        catch (error) {
            console.log(error);
        }
    });
});

EMSCRIPTEN_KEEPALIVE
void loadLibrary() {
    printf("before\n");
    doLoadLibrary();
    printf("after\n");
}

int main()
{
}

side.c:

#define SIZE 5000
char dummy[SIZE] = {};

int side(void)
{
    return SIZE;
}

index.html:

<!DOCTYPE html>
<html>

<head>
</head>

<body>
  <input id="uploadLibrary" type="file" />
  <button id="loadLibrary">loadLibrary</button>
  <script>

    function uploadLibrary() {
      var files = this.files;
      if (files.length === 0) {
        console.log("No file is selected");
        return;
      }
      var file = files[0];
      var reader = new FileReader();
      reader.onload = function () {
        var data = new Uint8Array(reader.result);
        Module["FS_createDataFile"]("/", file.name, data, true, true, true);
      };
      reader.readAsArrayBuffer(file);
    }

    function loadLibrary() {
      Module._loadLibrary();
    }

    document.getElementById("uploadLibrary").addEventListener("change", uploadLibrary, false);
    document.getElementById("loadLibrary").addEventListener("click", loadLibrary, false);
  </script>
  <script async src="main.js"></script>
</body>

</html>

Makefile:

all: clean
    mkdir -p side
    emcc side.c -s SIDE_MODULE=1 -o side/side.wasm
    emcc main.c -s MAIN_MODULE=1 -s ASYNCIFY -s "ASYNCIFY_IMPORTS=['doLoadLibrary']" -s FORCE_FILESYSTEM=1 -s "EXTRA_EXPORTED_RUNTIME_METHODS=['FS']" -o main.html
    python3 -m http.server 8080

clean:
    rm -rf side main.html main.js main.wasm main.data

In your browser, click "Choose File" and select the side/side.wasm file, and then click "loadLibrary" button. You will see the following result in the console:

before
side module size: 5000
after

Thank you so much for your help so far. You can close this issue if you want.

sbc100 commented 4 years ago

That's great news! So glad we figured out a method that worked.

Btw how are you then fishing the symbols out of the code? I think we still probably want to make RTLD_DEFAULT work with dlsym to enable this... but that can be a different issue perhaps?

cuinjune commented 4 years ago

As you can see from my code, I'm currently using Module['_side']() to call the function from a loaded side module and this way seems to work fine.

But if I try to call this function using dlsym and RTLD_DEFAULT, the lookup fails and the function cannot be called.

Here's some example code you can use to test this. (You just need to replace main.c from my previous example with the following.)

main.c:

#include <stdio.h>
#include <dlfcn.h>
#include <emscripten.h>

EM_JS(void, doLoadLibrary, (), {
    Asyncify.handleAsync(async() => {
        try {
            await loadDynamicLibrary('side.wasm', { loadAsync: true, global: true, nodelete: true, fs: FS });
        }
        catch (error) {
            console.log(error);
        }
    });
});

EMSCRIPTEN_KEEPALIVE
void loadLibrary() {
    printf("before\n");
    doLoadLibrary();
    printf("after\n");
    typedef int (*func_t)(void);
    func_t func = (func_t)dlsym(RTLD_DEFAULT, "side");
    if (!func) {
        printf("failed to find the method\n");
        return;
    }
    printf("side module size: %d byte\n", func());
}

int main()
{
}

result:

before
after
failed to find the method

So I guess this needs to be fixed in the future although it is currently still possible to call the function with Module['_side']().

goldwaving commented 3 years ago

I've been struggling with this as well trying to add the LAME MP3 encoder to GoldWave Infinity. We need to keep LAME separate to ensure we comply with the LGPL.

The documentation (at https://github.com/emscripten-core/emscripten/wiki/Linking) gives the impression that dynamic linking is straightforward. For real-world apps, it is more complicated. To help others in the future, I'd recommend updating the documentation to:

kripken commented 3 years ago

A PR with doc improvements would be welcome @goldwaving . Or if you do not have time, opening a new issue with your comments might be better, so that others can see it.

mewalig commented 2 years ago

This is insanely frustrating because we ran an initial test using dlopen() and it just so happened that our test .so file was smaller than 4k, so it worked perfectly. We have spent more than an entire day trying to reason out what could be going wrong when we drop our "real" .so file in, only to finally find out that it has to do with the size of the .so and that, literally, if I change a literal value in my C code from "this string" to "this string has more chars", dlopen() goes from succeeding to failing, which to anyone without peculiar knowledge of emscripten's bowels just makes no sense at all.

It would be better to simply not pretend to support dlopen() until it properly works, or works for at least a reasonably sized file, than to have it work but only for use cases that are useless 99% of the time. At the very least, please change the documentation so that it is not misleading-- which it entirely is today-- and to clearly spell out those very particular specific cases where this might actually work. This is a very easy problem to reproduce and to clearly define the limits and boundaries for-- can we not spend the few minutes it takes to add some clarification to the documentation to prevent others from unnecessarily going down this dark rabbit hole?

AndyC-NN commented 2 years ago

The easy way to actually do this is to load the module with WebAssembly.instantiate on the JavaScript side. Use this tutorial to get beyond the 4K limit: https://developers.google.com/web/updates/2018/04/loading-wasm. Once the module is loaded, get the function this way:

side = instance.exports._Z4sidev;

So we now have a JavaScript function for side, but it seems unusable from C++ until you realise that Emscripten has a handy function, addFunction, that returns a function pointer usable from C/C++:

sidePtr = addFunction(side, 'vi');

The 'vi' passed to addFunction is really not that important here. It can be anything. The call returns an int that is a function pointer. Some of you will now wonder how an int can be a pointer. Under the hood, once Emscripten has compiled the code to WASM, every C/C++ function pointer is an int. With that knowledge you can just do this in C/C++:

typedef int (*side_function)();
side_function side = (side_function)sidePtr;
int size = side();

And somehow compiling with Emscripten this all works great!
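
A minimal sketch of the asynchronous loading step described above, assuming a browser or Node.js with the standard WebAssembly JavaScript API. The inline bytes are a hypothetical hand-assembled module that exports one function named side; they stand in for a real fetched file, and loadSide is a made-up helper name. A real module built with SIDE_MODULE usually imports its memory and table from the main module, so it would need an import object that this sketch leaves out.

```javascript
// Hypothetical stand-in for the fetched file: a tiny module exporting
// `side`, a function with no parameters that returns the i32 constant 42.
const sideBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,       // type section: () -> i32
  0x03, 0x02, 0x01, 0x00,                         // function section: one function of type 0
  0x07, 0x08, 0x01, 0x04, 0x73, 0x69, 0x64, 0x65, 0x00, 0x00, // export "side" = function 0
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b, // code section: i32.const 42; end
]);

// WebAssembly.instantiate compiles asynchronously, so the 4KB limit on
// synchronous main-thread compilation does not apply to it.
async function loadSide(bytes, imports = {}) {
  const { instance } = await WebAssembly.instantiate(bytes, imports);
  return instance.exports.side;
}

loadSide(sideBytes).then((side) => {
  console.log("side() returned", side());
});
```

With the Emscripten runtime available, the function returned by loadSide is what would be handed to addFunction as described above.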

aisnote commented 2 years ago

@cuinjune did you run into this kind of issue before?

It seems to be caused by the build settings of my.wasm?

image

EM_JS(void, doLoadLibrary, (), {
    Asyncify.handleAsync(async() => {
        try
        {
            await loadDynamicLibrary('ff/lib/my.wasm', {loadAsync : true, global : true, nodelete : true, fs : FS});
            //console.log('side module size: ' + Module['_side']());
        }
        catch (error)
        {
            console.log(error);
        }
    });
});

pd-l2ork commented 1 year ago

Following up on this thread. @kripken I am working on the code base that @cuinjune originally developed, a critical component of which is the solution developed in this thread. It appears that building that code with an older version of emscripten allows loadDynamicLibrary to work. With the latest emscripten version it fails, as follows (please bear with me since I am new to emscripten and am still trying to find my way around, so apologies in advance if I describe the problem poorly):

1) Whether you build with the old or the new version, it appears that all the objects/plugins that may be loaded later, when the programming language runs, are "packaged" into the main.js file. Investigating this huge file, generated by emscripten, I find a bunch of statements like the one below:

{"filename":"/pd-l2ork/extra/maxlib/average.wasm","start":15274628,"end":15276992}

The filename reflects the file's original location on disk. I may be wrong about this, but it seems to me that the last two numbers reflect where the file is packed, address-wise, in the bundled file created by compiling and linking. Curiously, the older version also had "audio": 0/1 appended at the end of each entry, suggesting that it differentiated between audio files and objects. I am not sure why this is missing when compiling with a newer version, and whether that may be causing the aforesaid problem.

2) When running the demo code snippet on an older version of emscripten, additional objects open without a problem. On the latest version, the same request, instead of looking through the bundled objects, issues a GET request for the object inside the public/ folder (I suppose it is trying to wget it). The error on the console is as follows (I removed the irrelevant calls before it; it starts with the custom function call __Pd_loadLib, whose structure is effectively identical to the example @cuinjune posted above in https://github.com/emscripten-core/emscripten/issues/11753#issuecomment-670384796):

GET http://localhost:3000/emscripten/pd-l2ork-web/extra/maxlib/average.wasm 404 (Not Found)
readAsync   @   main.js:1
asyncLoad   @   main.js:1
(anonymous) @   main.js:1
loadLibData @   main.js:1
getExports  @   main.js:1
loadDynamicLibrary  @   main.js:1
(anonymous) @   main.js:1
(anonymous) @   main.js:1
handleSleep @   main.js:1
handleAsync @   main.js:1
__Pd_loadLib    @   main.js:1

Is the GET being called because the code is somehow not finding the object in question inside the bundle?

3) Copying the precompiled wasm objects contained in pd-l2ork-web into the public/emscripten folder allows the object to be found via GET, but then opening the object fails with an error saying that the main instantiation function symbol is not found. Namely:

error: load_object: Symbol "average_setup" not found

All objects have objectname_setup as their entry point function and have been compiled as follows (please pardon the redundancy of the makefile flags; this is a huge code base with over a thousand objects developed by third parties, so the automated build process is a bit messy):

emcc -s SIDE_MODULE=1 -O3 -sEMULATE_FUNCTION_POINTER_CASTS=1 -o "average.wasm" "average.o"   
chmod a-x "average.wasm"

Below is also the relevant code that calls loadDynamicLibrary:

EM_JS(int, __Pd_loadLib, (const char *filename, const char *symname), {
    console.log("Pd-L2Ork __Pd_loadlib filename=<" + UTF8ToString(filename) + "> symname=<" + UTF8ToString(symname) + ">");
    return Asyncify.handleAsync(async () => {
        try {
            await loadDynamicLibrary(UTF8ToString(filename), {loadAsync: true, global: true, nodelete: true, fs: FS});
            var makeout = Module['_' + UTF8ToString(symname)];
            if (typeof makeout === "function") {
                makeout();
                console.log("...success");
                return 1; // success
            }
            else {
                console.log("...no function found");
                return -1; // couldn't find the function
            }
        }
        catch (error) {
            console.log("...failed");
            console.error(error);
            return 0; // couldn't load the external
        }
    });
});

The relevant build script code that bundles everything together on the emscripten side of things is as follows:

-s ASYNCIFY -s "ASYNCIFY_IMPORTS=['__Pd_loadLib']" \
-s USE_SDL=2 -s ERROR_ON_UNDEFINED_SYMBOLS=0 -s ALLOW_MEMORY_GROWTH=1 \
-s FORCE_FILESYSTEM=1 -s "EXTRA_EXPORTED_RUNTIME_METHODS=['FS']" \
--no-heap-copy --preload-file pd-l2ork -L/home/l2orkist/Downloads/pd-l2ork/emscripten/build/../../libpd/libs -lpd -lm

I also updated the deprecated EXTRA_EXPORTED_RUNTIME_METHODS to EXPORTED_RUNTIME_METHODS, as follows:

-s ASYNCIFY -s "ASYNCIFY_IMPORTS=['__Pd_loadLib']" \
-s USE_SDL=2 -s ERROR_ON_UNDEFINED_SYMBOLS=0 -s ALLOW_MEMORY_GROWTH=1 \
-s FORCE_FILESYSTEM=1 -s "EXPORTED_RUNTIME_METHODS=['FS']" \
--no-heap-copy --preload-file pd-l2ork-web -L/home/l2orkist/Downloads/pd-l2ork/emscripten/build/../../libpd/libs -lpd -lm

4) I also tried changing the way objects are built, for instance:

emcc -s SIDE_MODULE=1 -O3 -sEMULATE_FUNCTION_POINTER_CASTS=1 -s EXPORTED_RUNTIME_METHODS=_average_setup -o "average.wasm" "average.o"

Note the EXPORTED_RUNTIME_METHODS. Not sure if these should be prepended with an underscore. I tried both with or without it. Also tried:

-s EXPORTED_FUNCTIONS=average_setup

That one generates the following warning:

emcc: warning: EXPORTED_FUNCTIONS is not valid with LINKABLE set (normally due to SIDE_MODULE=1/MAIN_MODULE=1) since all functions are exported this mode.  To export only a subset use SIDE_MODULE=2/MAIN_MODULE=2 [-Wunused-command-line-argument]

So, I also tried with SIDE_MODULE=2 and EXPORTED_FUNCTIONS with no success. Curiously, this also suggests that all functions are exported. If so, why are they not being found by loadDynamicLibrary call above?

5) Where does one find the API documentation for loadDynamicLibrary? Is this a built-in C function or an emscripten one?
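
On point 1), a small sketch of how such an entry can be read, assuming (as guessed above) that start and end are byte offsets into the bundled data file. The helper name sliceEntry and the toy bundle are hypothetical; the toy offsets are kept tiny so the example can run on its own.

```javascript
// Entry copied from point 1). start/end are ASSUMED to be byte offsets
// into the bundled data file produced by --preload-file.
const entry = {
  filename: "/pd-l2ork/extra/maxlib/average.wasm",
  start: 15274628,
  end: 15276992,
};

// Size of the packaged file under that assumption.
const sizeInBytes = entry.end - entry.start;
console.log(entry.filename, "occupies", sizeInBytes, "bytes");

// Hypothetical helper: view one file's bytes inside a bundle without copying.
function sliceEntry(bundle, { start, end }) {
  return bundle.subarray(start, end);
}

// Toy 16-byte bundle holding a "file" at offsets 4 (inclusive) to 10 (exclusive).
const toyBundle = Uint8Array.from({ length: 16 }, (_, i) => i);
const toyFile = sliceEntry(toyBundle, { start: 4, end: 10 });
console.log("toy file length:", toyFile.length);
```

Under that assumption, the average.wasm entry shown in point 1) would be 2364 bytes long.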

What can be done to once again be able to reference these bundled objects (preferable), or to allow them to be loaded via the GET option (which I imagine is a fallback routine; please correct me if I am wrong)?

I would greatly appreciate any input you may have here, as I am completely stumped: the only moving variable is the emscripten version (apart from the EXPORTED_RUNTIME_METHODS change, where I tried both versions). Thank you for your time in considering this request.
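
One way to check point 3) independently of loadDynamicLibrary is to list what a .wasm file actually exports. This is a minimal sketch using the standard WebAssembly.compile and WebAssembly.Module.exports calls. The helper name listExports is hypothetical, and the inline module (which exports a single function named side) is only a stand-in for a real file such as average.wasm.

```javascript
// Hypothetical helper: compile asynchronously (so the 4KB limit on
// synchronous main-thread compilation is not involved) and return the
// names of all exported functions.
async function listExports(bytes) {
  const compiled = await WebAssembly.compile(bytes);
  return WebAssembly.Module.exports(compiled)
    .filter((e) => e.kind === "function")
    .map((e) => e.name);
}

// Tiny hand-assembled stand-in module exporting one function, "side".
const demoBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,       // type section: () -> i32
  0x03, 0x02, 0x01, 0x00,                         // function section: one function of type 0
  0x07, 0x08, 0x01, 0x04, 0x73, 0x69, 0x64, 0x65, 0x00, 0x00, // export "side" = function 0
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b, // code section: i32.const 42; end
]);

listExports(demoBytes).then((names) => {
  console.log("exported functions:", names);
  console.log("has side:", names.includes("side"));
});
```

For a real file, the bytes could come from fetch(url) followed by arrayBuffer() in the browser. If the expected setup function does not appear in the list, that would point at how the side module was linked rather than at the loader.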