bytecodealliance / wasmtime

A fast and secure runtime for WebAssembly
https://wasmtime.dev/
Apache License 2.0
15.4k stars 1.3k forks source link

Port the JS WASI polyfill to wasi-common #520

Closed sunfishcode closed 1 year ago

sunfishcode commented 5 years ago

Wasmtime has a JS polyfill implementation of WASI:

https://github.com/CraneStation/wasmtime/tree/polyfill/wasmtime-wasi/js-polyfill

which works by compiling wasmtime's C implementation of WASI with Emscripten, and then adding a bit of support code on top of that.

We should port the support code to work on top of wasi-common, compiled with wasm32-unknown-emscripten. In theory, all the big pieces are in place -- wasi-common provides the same functionality as Wasmtime's C implementation of WASI, Emscripten can provide the same underlying support in both cases, and the support code should roughly be the same. So the work here is in rewiring everything up in wasi.js, and porting polyfill.c to Rust, probably splitting out that EM_ASM into a standalone JS file.

I've marked this "help wanted" -- if someone's interested in doing this port, that'd be great (and I can help mentor), but it'd even help just to have more ideas for packaging, binding, or other ways we could make the polyfill better. The current support code and shell.html is super minimal.

dingxiangfei2009 commented 5 years ago

@sunfishcode I would like to indicate my interest in helping here.

I have a question here. I have been experimenting with file descriptor syscalls in the current js-polyfill. Since WASM has yet to support 64-bit integers, WASI guests cannot make direct calls to functions like __wasi_fd_seek because __wasi_filedelta_t is 64-bit. So how would one make these calls? Currently I have a shim on top of these functions and split these 64-bit arguments up into two halves. It works, albeit being ugly.

dingxiangfei2009 commented 5 years ago

Okay, so I want to check my understanding with you here. In order to implement the same polyfill in wasi-common, we need to cross-compile wasmtime-wasi crate in https://github.com/CraneStation/wasmtime. Especially important is this module https://github.com/CraneStation/wasmtime/blob/5164994ce1c6e76af482906357f2ad3e07ddad35/wasmtime-wasi/src/syscalls.rs#L20. We need to swap out the references to VM with emscripten implementation.

sunfishcode commented 5 years ago

Concerning WASI's use of 64-bit ints: in the long term, JS engines will support this via bignums. In the short term, we're hoping that shims can be sufficient. The shims may get a little ugly, but I don't know of any better solution, short of changing the WASI APIs, which would also be ugly.

The wasmtime-wasi crate is just glue to connect the WASI implementation with Wasmtime's runtime data structure, so I don't think you need it.

I think the thing to start with is compiling wasi-common with wasm32-unknown-emscripten. That should give you a library containing C symbols prefixed with wasi_common_, such as wasi_common_fd_read. These are generated by a macro, wasi_common_cbindgen, so they should have C-like API compatible with WASI's external C API, except for the addition of the WasiCtx and memory arguments to functions which need access to extra state.

The next step is to start adapting the current polyfill's wasi.js to call that library. I think the main things to do are:

dingxiangfei2009 commented 5 years ago

@sunfishcode Thank you very much for the clear instruction.

If I understand correctly, we are relying on the nix crate for the syscalls. Unfortunately nix @ 0.14.1 has yet to support cfg(target_os="emscripten"). I will prepare a patch to nix for that first, if you don't mind.

kubkon commented 5 years ago

Hey @dingxiangfei2009, some (if not most) of our syscalls impl does indeed depend on the nix crate, although we are now working towards reusing as much Rust's libstd as possible. This way, libstd will take care of a lot of platform-specific tricky details for us :-)

dingxiangfei2009 commented 5 years ago

@kubkon I see. I assume that CraneStation/wasi-common#16 is related as well, so I will watch that list of functions.

Also, would you mind giving some hints on which function can reuse libstd? I have, for example, investigated host calls in fs.rs, but it would be demanding to switch to libstd since they are handling raw file descriptors while libstd does not accept raw file descriptors in most cases.

kubkon commented 5 years ago

@kubkon I see. I assume that CraneStation/wasi-common#16 is related as well, so I will watch that list of functions.

Also, would you mind giving some hints on which function can reuse libstd? I have, for example, investigated host calls in fs.rs, but it would be demanding to switch to libstd since they are handling raw file descriptors while libstd does not accept raw file descriptors in most cases.

That's true, but as it turns out, we don't need to operate on raw descriptors at all times anymore. Instead, it's preferable to use std::fs::File which is essentially a wrapper around a raw FD or Windows Handle. And so, to provide a nice example, fd_datasync previously did a call to either nix::unistd::fdatasync or nix::unistd::fsync depending on your nix platform, whereas now it uses std::fs::File::sync_data which internally, on a nix platform calls one of the above for us, and, as an added bonus, it also performs the correct syscall on Windows.

So all in all, the current trend is to move away from relying on raw descriptor manipulation as much as possible (of course, sometimes, like in the case of path manipulation, it might not be possible).

I hope that made it a bit clearer. If not, fire away and I'll try to explain it more! :-)

dingxiangfei2009 commented 5 years ago

@kubkon Thanks for the explanation, and I managed to get some findings after poking around. I have looked deeper into src/hostcalls_impl module and I have not find a syscall that can be switched over. Some of them do not have windows support even, where unimplemented! was filled in at poll_oneoff, clock_res_get, etc.

Access mode of file descriptors look promising and OpenOptions could be providing the necessary information. However, a File does not expose them after it is created.

Maybe I am overlooking something. Would you mind giving a direction here, like exactly which syscall can be "simplified" with libstd?

Speaking of overlooking stuff, I did find that fd_filestat_set_size can be adapted to File::set_len instead. Sorry! Anyway, I am eager to hear suggestions about other "simplification" opportunities!

dingxiangfei2009 commented 5 years ago

Also, since emscripten target appears to be compatible with linux target anyway, wouldn't it be easier to just patch nix to support target_os="emscripten"? Of cause, I could be very wrong here about this and please correct me here if so.

kubkon commented 5 years ago

@dingxiangfei2009 patching nix is a good idea in and of itself, that's for sure, so please feel free to do so -- it'll be a very worthwhile contribution, I'm sure of it :-) As far as our use of nix in wasi-common is concerned, perhaps it'll help if I outline what is where in the project (I probably should've done that in the first place, apologies!):

All in all, I think that your plan of patching nix first is a good idea (let me know if you need any help with that!). Afterwards, we can start figuring out how to enable emscripten target in wasi-common and slowly start porting over things mentioned by @sunfishcode. How's that sound everyone @dingxiangfei2009 @sunfishcode? :-)

kubkon commented 4 years ago

After putting in some thought into this and doing a little experiment with nix, patching up the nix crate so that it compiles for the wasm32-unknown-emscripten target is not such a trivial task after all as I originally assumed. It is doable, however it will not be immediately testable which IMHO makes this task infinitely more tedious and convoluted than originally planned. Then, even after it is patched, it will probably take some (long?) time for it to make it to a release, and I'd really like to have wasi-common compile to Emscripten sooner rather than later.

My proposition here is that we remove nix as a dependency altogether and write our own wrappers around the libc functions - after all, we've already kinda started giving up on nix anyhow when @marmistrz started refactoring fd_readdir on Linux and implementing it on Windows. Besides, we don't actually need that much of nix's machinery; we're only ever using a couple of wrappers which should be trivially wrapped by us. I'm happy to do the heavy-lifting here myself, but would appreciate code review and later help in refactoring bits here and there. As an added bonus of this approach, we should be able to wrap up all uses of libc functions we currently use in wasi-common, making the implementation so much cleaner rather than like now, when some use raw libc and some rely on nix.

cc @sunfishcode @marmistrz @peterhuene

kubkon commented 4 years ago

Actually, I'd appreciate your feedback @alexcrichton on this as well.

marmistrz commented 4 years ago

If nix is an obstacle, then I think dropping the dependency on nix is a sound decision, especially that we have a slightly different understanding of unsafe than the nix guys.

Could we possibly keep the Rusty wrappers in a separate crate, similar to winx, and publish it on crates.io? I propose the name: younix!

kubkon commented 4 years ago

I agree on keeping the wrappers in a separate crate :-)

alexcrichton commented 4 years ago

This sounds like a reasonable solution to me, it's likely inevitable that there's a sys/* style hierarchy as libstd has, and emscripten would be "Just Another Platform" that can be targeted.

sunfishcode commented 4 years ago

Me too :slightly_smiling_face:

kubkon commented 4 years ago

It’s settled then! I’ve already started prepping some ground work and should be able to deliver a working (not necessarily final mind you) solution, i.e., wasi-common nix-free, soon-ish.

@marmistrz, how about yanix for a name? Yet another nix crate... 😉

marmistrz commented 4 years ago

@kubkon yanix is missing the wordplay (and I love wordplays! :)), but that's fine for me too.

sunfishcode commented 1 year ago

Closing this old issue, as the JS polyfill referenced here is no longer in use.