konsoletyper / teavm

Compiles Java bytecode to JavaScript, WebAssembly and C
https://teavm.org
Apache License 2.0

[WASM] Dynamic linking #926

Closed TrOllOchamO closed 2 weeks ago

TrOllOchamO commented 3 weeks ago

Hello! I'm currently using a Java library that compiles to a quite large WASM blob. Since I plan to use this library in multiple modules, I was wondering whether it is possible to achieve something like dynamic linking with TeaVM. The goal would be to have a single module that acts as a library, which other WASM instances could rely on for utility functions instead of each bundling the utility code. Is it possible? Thanks in advance, --Barnabé

konsoletyper commented 3 weeks ago

Do you mean per-jar or per-class Java -> Wasm translation? No, it's not possible, and I hope it never will be. One option is to translate this library into a WebAssembly module with exported methods and then, from particular applications, import it using @Import annotations instead of calling the library classes directly.

TrOllOchamO commented 3 weeks ago

Hmm... I'm not sure I understand your answer, and since I'm not sure my question was well asked, I'll try to explain what I want to achieve with an example.

Let's say I have this Java class that compiles to a large binary blob that we will call large.wasm:

public class JsonSerializer {
    @Export(name = "jsonStringToMap")
    public Map jsonStringToMap(String json) { /* some code */ }

    @Export(name = "jsonMapToString")
    public String jsonMapToString(Map json) { /* some code */ }
}

And another Java class from another module that compiles to small.wasm:

public class Main {
    @Import(name = "jsonStringToMap", module = "JsonSerializer")
    public static native Map jsonStringToMap(String json);

    public static void main(String[] args) {
        var jsonAsMap = jsonStringToMap("{\"key\": \"value\"}");
        // do something with the map
    }
}

In the JS client I would instantiate large.wasm and then instantiate small.wasm like so:

const options = {
  installImports: (importObj, _controller) => {
    importObj.JsonSerializer = {
      jsonStringToMap: teavm.instance.exports.jsonStringToMap, // the exported function from large.wasm
    };
  }
};
const teavmInstance = await TeaVM.wasm.load(wasmPath, options);

Could something like this work, or is there something wrong with it? Thanks for your response, --Barnabé

TrOllOchamO commented 2 weeks ago

I tried the above solution, but I keep getting an error saying there is a signature mismatch when I import the large.wasm functions during the instantiation of small.wasm. I spent the last hour and a half searching for the mismatch, so I wanted to be sure: is it really supposed to work, or am I being misled by an unrelated error message? Thanks in advance, --Barnabé

TrOllOchamO commented 2 weeks ago

Ok, after taking a look at the generated wast files, I might have a clue.

In my first project, I have this function that I want to export:

@Export(name = "ValidValueTransmitter_getValidValueRequest")
public ValidValueDialogPostRequest getValidValueRequest(Form formSpec) {
  return new ValidValueTransmitter().getValidValueRequest(formSpec);
}

It gets compiled to this:

;; The generated signature of the function I want to export
(func $ValidValueTransmitter_getValidValueRequest (export "ValidValueTransmitter_getValidValueRequest") (type $type2) 
  ;; ...
)

In my other project I want to import the above function, so I do it like so:

@Import(name = "getValidValueRequest", module = "ValidValueTransmitter")
public static native ValidValueDialogPostRequest getValidValueRequest(Form formSpec);

And it produces the wast below:

;; The generated signature of the function that I should import
(func $getValidValueRequest (import "ValidValueTransmitter" "getValidValueRequest") (type $type0))

I believe the mismatch comes from the Form class being compiled to type2 in the first project and type0 in the second one. If so, is there a way to force a class to be associated with a given type id? Sorry for posting so much ^^' Thanks for your time, --Barnabé

konsoletyper commented 2 weeks ago

You misread this. A function's type is not its parameter type. Function types are anonymous; $type0 and $type2 are just aliases. You need to decompile both modules (for example, using wabt) and find the actual definitions of $type0 and $type2. No, it's not possible to assign type ids in TeaVM, and it would not make sense.

TrOllOchamO commented 2 weeks ago

Oh yeah, my bad... So in the wat format, the exported function signature looks like this:

(func (;248;) (type 2) (param i32 i32) (result i32)
  ;; ...
)

but the imported function corresponds to this signature:

  (type (;0;) (func (param i32) (result i32)))

So here is the mismatch: the exported function takes two i32 arguments, but the import declares only one :thinking: Any idea why? Thanks again, --Barnabé

konsoletyper commented 2 weeks ago

The first method is not declared as static, so the corresponding WebAssembly function takes an extra "instance" parameter.
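
The mismatch can be reproduced outside TeaVM. The sketch below hand-assembles three tiny WebAssembly modules. The byte arrays, the `lib`, `f` and `g` names and the `instantiate` helper are made up for illustration and are not TeaVM output. One exporter's function takes two i32 parameters, like an instance method with its receiver; the other takes one, like a static method; the importer declares a one-parameter import. Linking the first pair should fail with a `WebAssembly.LinkError`, while the second pair links and runs.

```javascript
// Hand-assembled modules for illustration only; TeaVM's real output is far larger.
const HEADER = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]; // "\0asm", version 1

// Exports "f" with type (i32, i32) -> i32, as if the first i32 were the receiver.
const instanceStyleExporter = [
  ...HEADER,
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type 0: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function of type 0
  0x07, 0x05, 0x01, 0x01, 0x66, 0x00, 0x00,             // export "f" = function 0
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x20, 0x01, 0x0b,       // body: local.get 1
];

// Exports "f" with type (i32) -> i32, as a static method would.
const staticStyleExporter = [
  ...HEADER,
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f,       // type 0: (i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function of type 0
  0x07, 0x05, 0x01, 0x01, 0x66, 0x00, 0x00,             // export "f" = function 0
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x20, 0x00, 0x0b,       // body: local.get 0
];

// Imports "lib"."f" as (i32) -> i32 and exports "g", which forwards to it.
const importer = [
  ...HEADER,
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f,                   // type 0: (i32) -> i32
  0x02, 0x09, 0x01, 0x03, 0x6c, 0x69, 0x62, 0x01, 0x66, 0x00, 0x00, // import lib.f : type 0
  0x03, 0x02, 0x01, 0x00,                                           // function 1 of type 0
  0x07, 0x05, 0x01, 0x01, 0x67, 0x00, 0x01,                         // export "g" = function 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x20, 0x00, 0x10, 0x00, 0x0b,       // local.get 0; call 0
];

// Compile and instantiate synchronously (fine for modules this small).
function instantiate(bytes, imports = {}) {
  const compiled = new WebAssembly.Module(new Uint8Array(bytes));
  return new WebAssembly.Instance(compiled, imports);
}

// Two-parameter export against a one-parameter import: linking is rejected.
let linkError = null;
try {
  const lib = instantiate(instanceStyleExporter);
  instantiate(importer, { lib: { f: lib.exports.f } });
} catch (e) {
  linkError = e;
}

// One-parameter export against a one-parameter import: linking succeeds.
const staticLib = instantiate(staticStyleExporter);
const app = instantiate(importer, { lib: { f: staticLib.exports.f } });
const result = app.exports.g(42); // g forwards to f, which returns its argument
console.log(linkError instanceof WebAssembly.LinkError, result);
```

The signature check happens at instantiation time, which is why the error shows up while loading the importing module rather than at the first call.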

TrOllOchamO commented 2 weeks ago

Yippee! It works! :confetti_ball:

One last thing: if the body of the exported function manipulates a static array, then once imported elsewhere, this function does not necessarily have a static array to manipulate anymore, right? So what kind of situation might I expect? Memory corruption? An unreachable? Or is the instance smart enough to somehow know that the imported function needs a static array and create one at instantiation?

Thank you so much for your patience! --Barnabé

konsoletyper commented 2 weeks ago

Whether a function is imported has nothing to do with its ability to work with static arrays, so I don't really understand your question.

The thing is the following: TeaVM represents objects as bytes in the heap, and a pointer to an object is just a number that indicates the heap offset at which the object can be found. TeaVM can and eventually will move objects in the heap during defragmentation, so if you have an imported method that returns a pointer to a single object, you can receive different values on repeated calls.

The following is also true. Consider JavaScript like this:

let ptr = callJavaMethodA();
doSomething();
callJavaMethodB();

If doSomething does not call into Java, then ptr is still valid after it returns. Once execution reaches callJavaMethodB, ptr becomes invalid.
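
To make the hazard concrete, here is a toy model in plain JavaScript. Nothing in it is TeaVM code: the `heap` array, the bump allocator, `compact`, `readString` and both `callJavaMethod*` stand-ins are invented for illustration, and a real collector is far more involved. It only shows why a raw offset read after a later call can point at different bytes, and why copying the data out immediately is the safer pattern.

```javascript
// Toy model, not TeaVM: a 64-byte "heap" with a bump allocator and a compactor.
const heap = new Uint8Array(64);
const live = [];   // objects that survive compaction: { offset, length }
let top = 0;       // next free offset

function alloc(text, keep) {
  const bytes = new TextEncoder().encode(text);
  const obj = { offset: top, length: bytes.length };
  heap.set(bytes, obj.offset);
  top += bytes.length;
  if (keep) live.push(obj);
  return obj;
}

// Slide live objects to the start of the heap, as a moving collector might.
function compact() {
  let dest = 0;
  for (const obj of live) {
    heap.copyWithin(dest, obj.offset, obj.offset + obj.length);
    obj.offset = dest;
    dest += obj.length;
  }
  heap.fill(0, dest, top); // clear the space that was freed
  top = dest;
}

function readString(ptr, length) {
  return new TextDecoder().decode(heap.subarray(ptr, ptr + length));
}

// Stand-ins for calls into Java.
function callJavaMethodA() {
  alloc("garbage!", false);            // temporary object that dies immediately
  return alloc("hello", true).offset;  // raw pointer handed to JavaScript
}
function callJavaMethodB() {
  compact();                           // in this model, every call may move objects
  alloc("world", true);
}

const ptr = callJavaMethodA();
const copied = readString(ptr, 5);     // read before any further call into "Java"
callJavaMethodB();
const stale = readString(ptr, 5);      // same offset, but the object has moved
console.log(copied === "hello", stale === "hello");
```

In this model the early read sees the intended bytes, while the late read sees whatever ended up at that offset after compaction.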

As for memory corruption, its effect is quite unpredictable. The corruption may never have a visible side effect, or the corrupted heap can survive several GCs, after which your program starts behaving incorrectly, or the VM crashes completely, or the VM just hangs. There's no way to detect such a situation, except by taking a heap snapshot every time you make an unsafe invocation and verifying the new state of the heap against this snapshot (which is slow).

> Thank you so much for your patience!

A thank-you will not buy me a coffee. A donation will.

TrOllOchamO commented 2 weeks ago

My question was not well asked. What I was wondering is how a function like this behaves when exported:

public class Bar {
  private static int[] myStaticArray;

  @Export(name = "doesSomethingToAStaticArrayAndReturnAnObject")
  public static Map doesSomethingToAStaticArrayAndReturnAnObject() {
    // modify myStaticArray
    // create a Map<..>
    return theMap;
  }
}

Let's assume this code gets compiled to export.wasm and the function is imported in import.wasm.

My question is: how does the doesSomethingToAStaticArrayAndReturnAnObject function behave when used in the import.wasm blob that imported it?

1) Will import.wasm have its own static array created when importing the function? Or will the function modify the static array contained in export.wasm when called? Or will it access a random spot in import.wasm's memory (UB)?

2) Same question for the returned Map object: is it created in import.wasm's memory or in export.wasm's memory? I'm not sure what it technically means, memory-wise, to use an exported function.

Don't worry, I definitely plan to buy you a drink; with all the questions you answered, it is well deserved :joy: You will also be in the thanks section of my internship report (even if this won't help with the coffee addiction).

Thanks, --Barnabé

konsoletyper commented 2 weeks ago

> My question is: how does the doesSomethingToAStaticArrayAndReturnAnObject function behave when used in the import.wasm blob that imported it?

All classical WebAssembly functions consume and produce merely numbers, and we may interpret these numbers as addresses in some WebAssembly heap. When you have two separate modules, each instantiated, each has its own heap. So import.wasm won't get direct access to objects or arrays from export.wasm. I guess in this case you should write some intermediate JavaScript code that simply copies data from one heap into another.
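
A minimal sketch of such glue code, assuming each module's heap is reachable from JavaScript as a `WebAssembly.Memory`. The two memories here are created standalone for the demonstration, and `copyBetweenHeaps` and the offsets 128 and 256 are hypothetical. With real TeaVM modules the destination offset would have to come from an allocation made by the receiving VM, which this sketch does not cover.

```javascript
// Standalone memories standing in for the heaps of export.wasm and import.wasm.
const exportMemory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64 KiB
const importMemory = new WebAssembly.Memory({ initial: 1 });

// Copy `length` bytes from one heap to the other (hypothetical helper).
function copyBetweenHeaps(srcMemory, srcOffset, dstMemory, dstOffset, length) {
  // Create the views at call time: growing a memory detaches older buffers.
  const src = new Uint8Array(srcMemory.buffer, srcOffset, length);
  const dst = new Uint8Array(dstMemory.buffer, dstOffset, length);
  dst.set(src);
}

// Pretend export.wasm produced these bytes at offset 128 of its own heap.
const payload = new TextEncoder().encode('{"key": "value"}');
new Uint8Array(exportMemory.buffer).set(payload, 128);

// Offset 256 is an arbitrary destination chosen for the demonstration.
copyBetweenHeaps(exportMemory, 128, importMemory, 256, payload.length);

const received = new TextDecoder().decode(
  new Uint8Array(importMemory.buffer, 256, payload.length)
);
console.log(received);
```

Only the copied bytes carry over: an offset taken from one heap means nothing in the other, so any pointers embedded in the copied data would need translating as well.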

In more recent versions of WebAssembly there's the multi-memory proposal, so in theory it would be possible to implement off-heap buffers and use them to exchange data between VMs. But since there's the GC spec, I would rather invest my time in that, so that two modules could communicate via JSO.

TrOllOchamO commented 2 weeks ago

Got it! Thanks :+1:

Zireael07 commented 2 weeks ago

@TrOllOchamO You might want to look into the WebAssembly Component Model proposal