Closed rennergade closed 1 week ago
Why not use Wasmer? :)
Some updates/docs for fork: we have successfully implemented fork inside wasmtime, using the same approach as Wasmer. Detailed steps for the current fork implementation:
Besides, we also have very basic support for forking inside a multi-threaded environment. The way thread memory works in wasmtime is that the wasm module must declare its memory as imported and shared when compiled, so that wasmtime can create a new shared memory and put it into the Linker struct. This way, whenever the wasm module accesses memory, it looks up the imported memory from the Linker. Since the Linker is shared between threads, the memory is inherently shared as well. To make this work with fork, we clone the Linker for the new forked instance and replace the imported memory with a new memory. However, there are still issues with fork interacting with threads.
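To make the sharing behaviour above concrete, here is a minimal sketch in plain Python. It is a toy model, not wasmtime's API: `Linker`, `spawn_thread`, and `fork` are hypothetical names, a `bytearray` stands in for the shared linear memory, and it assumes the forked instance's new memory starts as a copy of the parent's.

```python
import copy


class Linker:
    """Toy stand-in for a linker: maps import names to objects."""

    def __init__(self):
        self.imports = {}

    def define(self, name, obj):
        self.imports[name] = obj

    def get(self, name):
        return self.imports[name]

    def clone(self):
        # Shallow clone: the new linker has its own table, but the
        # entries still point at the same underlying objects.
        cloned = Linker()
        cloned.imports = dict(self.imports)
        return cloned


def spawn_thread(linker):
    # A thread reuses the parent's linker, so it sees the same memory.
    return linker


def fork(linker):
    # A forked instance gets a cloned linker whose memory import is
    # replaced by a private copy of the parent's memory (assumption).
    child = linker.clone()
    child.define("memory", copy.deepcopy(linker.get("memory")))
    return child


parent = Linker()
parent.define("memory", bytearray(16))

thread = spawn_thread(parent)
child = fork(parent)

parent.get("memory")[0] = 0xAA          # parent writes after both were created
thread_sees = thread.get("memory")[0]   # same object as the parent's memory
child_sees = child.get("memory")[0]     # private copy taken at fork time
```

In this model the thread observes the parent's write and the forked child does not, which is the distinction the cloned Linker is meant to provide.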
Great job at getting most of this together. One thing I want to hark back to is what the title of this issue suggests: fork should be invoked through the clone() syscall from libc, which should also be the call for creating threads. We should use basically the same semantics as clone() in Linux.
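As a rough illustration of what "same semantics as clone()" could mean for dispatch, the sketch below picks between a thread-like and a fork-like path based on `CLONE_VM`. The flag values are the ones defined by Linux; `classify_clone` and its return strings are hypothetical, and real callers pass many more flags than the two shown here.

```python
# Flag values as defined in Linux's <linux/sched.h>.
CLONE_VM = 0x00000100      # child shares the caller's address space
CLONE_THREAD = 0x00010000  # child joins the caller's thread group


def classify_clone(flags):
    """Return which runtime path a clone() call would take in this sketch."""
    if flags & CLONE_VM:
        return "thread"  # shared memory: new instance, same linear memory
    return "fork"        # separate memory: copy the linear memory


fork_like = classify_clone(0)
thread_like = classify_clone(CLONE_VM | CLONE_THREAD)
```

The point of the sketch is only that a single entry point can serve both cases, with the memory-sharing flag deciding whether linear memory is copied.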
Blueprint:
On Thu, Oct 3, 2024 at 1:05 PM Qianxi Chen wrote:
When I was testing some multi-threaded programs that involve large buffers, I realized there is a minor issue with our approach of using asyncify to implement threads as a shared-memory version of fork. Basically, creating a thread is like doing a fork, except that the memory is not copied and both sides use the same memory. A new stack address is assigned to the thread so that it won't conflict with the parent's stack. The issue is that the unwind/rewind process saves the stack pointer and restores it after rewinding. This forces the thread's stack pointer back to the parent's stack pointer, even if I have already set the child's stack pointer. Currently I can think of two solutions:
- The first solution is to modify the copied unwind data for the child (the unwinding process generates unwind data, which is then used for rewinding) so that the stack pointer stored in the unwind data becomes the new stack pointer. This approach might be a little hard, since the unwind data is raw binary, so it is hard to tell where the stack pointer is stored inside it.
- The second solution is to modify the asyncified wasm module directly, which is what I am currently using. I removed the instruction where the stack pointer is restored (replacing global.set __stack_pointer with drop), and this seems to be working fine right now. However, although the tests pass, I am not sure what exactly could happen with this approach. It would also add another compilation step to our build process.
I'd recommend asking on Zulip to understand more about why it was done this way in the first place. Your solution seems logical, but I also worry about what is now broken.
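The stack pointer problem described in this comment can be reproduced in a toy model. This is only a sketch of the save/restore behaviour as described in the thread, not asyncify's actual implementation: `Instance`, `unwind`, `rewind`, and both addresses are hypothetical.

```python
class Instance:
    """Toy model of an instance with a single stack-pointer global."""

    def __init__(self, stack_pointer):
        self.stack_pointer = stack_pointer

    def unwind(self):
        # Unwinding records the current stack pointer in the unwind data.
        return {"saved_sp": self.stack_pointer}

    def rewind(self, unwind_data):
        # Rewinding restores whatever stack pointer was recorded.
        self.stack_pointer = unwind_data["saved_sp"]


PARENT_SP = 0x10000  # hypothetical addresses
THREAD_SP = 0x20000

parent = Instance(PARENT_SP)
data = parent.unwind()

thread = Instance(THREAD_SP)  # the thread starts on its own stack...
thread.rewind(dict(data))     # ...but rewinding with copied data resets it
clobbered_sp = thread.stack_pointer
```

After the rewind, the thread's stack pointer equals the parent's again, which is why either the copied unwind data or the restoring instruction has to change.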
Yeah, this approach is not stable and it can crash in some cases. So I tried the first approach and dived deeper into how asyncify modifies the wasm code by reading through its wat file, and it looks like the stack pointer is set from a fixed variable in the unwind data. I replaced the value of that variable in the unwind data, and now the stack pointer seems to be set correctly and the tests run stably. However, there is still one remaining question about the offset of that variable: I am not sure whether the offset is a fixed value or whether it may change for other programs. I've tried several test cases and 0xc works for all of them. I am not sure whether people on Zulip would want to answer questions about Asyncify, because it is not something wasmtime uses.
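For reference, patching the copied unwind data could look roughly like the sketch below. It assumes a wasm32 module (a 4-byte, little-endian stack pointer) and uses the 0xc offset observed in the tests above, which is not confirmed to be fixed across programs. `patch_stack_pointer` is a hypothetical helper, and the buffer contents are made up for illustration.

```python
import struct

SP_OFFSET = 0xC  # offset observed in this thread; not confirmed to be fixed


def patch_stack_pointer(unwind_data, new_sp, offset=SP_OFFSET):
    """Return a copy of unwind_data with the 32-bit value at offset replaced."""
    if offset + 4 > len(unwind_data):
        raise ValueError("unwind data too short for the given offset")
    patched = bytearray(unwind_data)
    # WebAssembly linear memory is little-endian; assume an i32 stack pointer.
    struct.pack_into("<I", patched, offset, new_sp)
    return bytes(patched)


# Made-up 16-byte buffer with a placeholder parent stack pointer at 0xC.
parent_data = struct.pack("<IIII", 0, 0, 0, 0x10000)
child_data = patch_stack_pointer(parent_data, 0x20000)
child_sp = struct.unpack_from("<I", child_data, SP_OFFSET)[0]
```

Keeping the offset as a parameter makes it easy to adjust if some other program turns out to store the value elsewhere.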
In addition to creating threads via clone(), we also want to fork processes. Wasmtime doesn't have this capability built in, but Wasmer seems to have it: https://github.com/wasmerio/wasmer/blob/eb9127036add9f2a174ec1623600ebcc802fed6f/lib/wasix/src/syscalls/wasix/proc_fork.rs#L23
Can we look into what they're doing here and see if we can recreate this as an add-on module to Wasmtime?