Closed qianxichen233 closed 2 months ago
Can you expand upon where you're seeing performance degradation?
So when I run `wasmtime run fork.wasm`, there is a 5-6 second delay before the module actually starts executing. The delay comes from wasmtime's "compiling module" step, whose cost I think is directly related to the size of the binary.
btw, wasmtime has a cache feature for this, so if I run the module a second time, it starts executing immediately without compiling the module again.
I'd like us to explore if we can do this ahead of time and not pay this cost at load time.
Yeah, I think it should be possible to pre-compile the wasm module. I just took a look into wasmtime and noticed that it detects whether the passed module is pre-compiled or not. It only goes down the compile-module path if the module is not pre-compiled; otherwise it just deserializes the object. I haven't yet found how to pre-compile a module in wasmtime, but since this check exists in the code, I think there must be a way to do it.
I just figured out how to pre-compile the wasm module. We can just run `wasmtime compile` to further compile the wasm file into a cwasm file. Running the cwasm file skips the compilation and starts executing immediately, so I guess the delay issue can be considered resolved?
However, I still think the binary size is an issue, but I am not sure if we care about this.
Great! I think you could close this and open a new issue. Or, if you want, just revise this issue, perhaps by crossing out the parts of the description that no longer matter, to make it clear which part you've fixed.
Our compiled wasm binary is almost 10 times larger than the binary compiled from the same C program with wasi-libc/wasix-libc. This became obvious after we started using Binaryen's Asyncify for multi-processing calls, since it expands the binary by a further 3-4 times, which starts to cause a noticeable delay when running the wasm module. (wasix-libc also uses Asyncify for multi-processing, so its binaries also grow by 3-4 times, but because their base binary size is small, no obvious delay can be observed.)
For example, for fork.c:
Another thing I've noticed is that the binary size in lind-wasm increases dramatically when we use certain functions. For example, take two C programs, write.c and printf.c. Both print a message to stdout; the only difference is that one uses the write syscall directly and the other uses printf. Below is the number of lines in each compiled WAT file:
printf.wat: 450604 lines
write.wat: 9782 lines
My current thinking is that our binary is too large because we include too much unnecessary code from glibc. For instance, there are a lot of functions that do not work in lind-wasm, but we still include them in the binary and let users call them.