emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

Question on usage of libc functions in Wasm #15296

Closed miladfarca closed 2 years ago

miladfarca commented 2 years ago

I have a question on using libc functions such as malloc on Big Endian platforms.

Wasm mandates little-endian byte order. On BE platforms, JS engines such as V8 are responsible for reversing the bytes of every load/store instruction at runtime so that their behaviour matches LE platforms like x64.
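The byte reversal described above can be sketched in C. This is only an illustration of the idea, not V8's actual code: the names `bswap32`, `host_is_big_endian`, `wasm_i32_store` and `wasm_i32_load` are hypothetical, and the `mem` buffer stands in for wasm linear memory.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reverse the four bytes of a 32-bit value. */
static uint32_t bswap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Returns 1 on a big-endian host, 0 on a little-endian one. */
static int host_is_big_endian(void) {
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Sketch of a 32-bit store into linear memory: the bytes that land in
 * `mem` are always little-endian, so a BE host swaps before writing. */
static void wasm_i32_store(unsigned char *mem, uint32_t addr, uint32_t value) {
    if (host_is_big_endian()) value = bswap32(value);
    memcpy(mem + addr, &value, sizeof value);
}

/* Sketch of the matching load: read the raw bytes, then swap on a BE
 * host so the caller gets the original value back. */
static uint32_t wasm_i32_load(const unsigned char *mem, uint32_t addr) {
    uint32_t value;
    memcpy(&value, mem + addr, sizeof value);
    return host_is_big_endian() ? bswap32(value) : value;
}
```

With these helpers, storing 0x11223344 at offset 4 leaves the bytes 44 33 22 11 in the buffer on either kind of host, and loading from the same offset returns 0x11223344 again.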

My question is what happens when system libraries are linked with Wasm on BE machines? Take this snippet as an example:

#include <stdlib.h>

int main() {
  int *p = (int *)malloc(sizeof(int));
  // ....
  return 0;
}

Does emscripten use the system malloc implementation at runtime to do this, or does it have its own implementation bundled with the final wasm binary file?

@kripken

sbc100 commented 2 years ago

The only libc that can be linked into an emscripten project is one that is already built for WebAssembly. In practice the one we use is a custom version of Musl libc which is included with emscripten itself. In other words, there is no way to link the system version of libc/malloc into your program.

miladfarca commented 2 years ago

Thank you. Would you be able to point me to its implementation? Is this the right place to find it? https://github.com/emscripten-core/emscripten/tree/main/system/lib/libc/musl/src/stdlib

sbc100 commented 2 years ago

Yup, technically it is one level up from that, but you've got the right idea.

There is also a README that explains where it comes from: https://github.com/emscripten-core/emscripten/blob/main/system/lib/libc/README.md

I'm working on an update to the latest version of musl right now: https://github.com/emscripten-core/emscripten/pull/13006. As part of that I'm using an external repo to track our local changes: https://github.com/emscripten-core/musl

miladfarca commented 2 years ago

Thank you, so I guess the malloc implementation is under emscripten-core/musl and not main/system/lib/libc.

Either way, I think the problem remains. We have attempted an initial fix for BE platforms in this PR, which fixes the usage of TypedArrays: https://github.com/emscripten-core/emscripten/pull/13413

My question now is, do we also need to fix the libc implementation in this repository so that it loads/stores values in reverse on BE platforms?

sbc100 commented 2 years ago

The default allocator we use is dlmalloc: https://github.com/emscripten-core/emscripten/blob/main/system/lib/dlmalloc.c.

I don't see how malloc is any different to any other native code. How is it different to any other native function that returns a pointer?

miladfarca commented 2 years ago

Not different at all, I just wanted to use malloc as an example to learn what happens behind the scenes.

So the wasm runtime (V8) makes a call to a libc function which is also provided by emscripten. The libc implementation, however, is not LE enforced. I assume it passes the output back to V8 in BE order, which then gets reversed by V8, i.e. there is currently no bridge between libc and V8 within emscripten to reverse the values passed between them. Is that correct?

sbc100 commented 2 years ago

I don't know how V8 runs wasm on BE architectures... does this work at all?

I don't think you need to consider malloc or libc as special, since they are not. They are just ordinary functions. You can consider a much simpler wasm function:

int foo() { return 42; }

Just like malloc, this function returns an i32.
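To make the comparison concrete, here is a small sketch in C. It assumes the usual wasm model in which a function result is handed back as a value rather than as bytes in linear memory; the helpers `low_byte_of_value` and `first_byte_in_memory` are hypothetical names used only for illustration.

```c
#include <string.h>

/* The function from the comment above. */
int foo(void) { return 42; }

/* Extracts the low 8 bits arithmetically. No memory access is involved,
 * so the result is the same on LE and BE hosts. */
unsigned low_byte_of_value(int v) {
    return (unsigned)v & 0xFFu;
}

/* Copies the value into memory and reads the first byte back. This is
 * the only place where the host's byte order becomes visible. */
unsigned first_byte_in_memory(int v) {
    unsigned char bytes[sizeof v];
    memcpy(bytes, &v, sizeof v);
    return bytes[0];
}
```

Here `low_byte_of_value(foo())` is 42 everywhere, while `first_byte_in_memory(foo())` is 42 on a little-endian host and 0 on a big-endian one. Byte order only comes into play once a value is written to or read from memory, which applies equally to the i32 returned by `foo` and the pointer returned by malloc.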

sbc100 commented 2 years ago

Perhaps @kripken would better understand the intent behind this question?

miladfarca commented 2 years ago

Thanks for clarifying. So essentially emscripten compiles every library it needs to wasm at compile time, i.e. it doesn't link against a natively compiled libc function and jump to it at runtime?

sbc100 commented 2 years ago

Yes, there is no way you can link any host libraries into an emscripten project. It's simply not possible.

miladfarca commented 2 years ago

Thanks again for clarifying, I will now close this issue.