Closed sunfishcode closed 1 year ago
@sunfishcode I would like to indicate my interest in helping here.
I have a question here. I have been experimenting with file descriptor syscalls in the current js-polyfill
. Since WASM has yet to support 64-bit integers, WASI guests cannot make direct calls to functions like __wasi_fd_seek
because __wasi_filedelta_t
is 64-bit. So how would one make these calls? Currently I have a shim on top of these functions and split these 64-bit arguments up into two halves. It works, albeit being ugly.
Okay, so I want to check my understanding with you here. In order to implement the same polyfill in wasi-common
, we need to cross-compile wasmtime-wasi
crate in https://github.com/CraneStation/wasmtime. Especially important is this module https://github.com/CraneStation/wasmtime/blob/5164994ce1c6e76af482906357f2ad3e07ddad35/wasmtime-wasi/src/syscalls.rs#L20. We need to swap out the references to VM with emscripten
implementation.
Concerning WASI's use of 64-bit ints: in the long term, JS engines will support this via bignums. In the short term, we're hoping that shims can be sufficient. The shims may get a little ugly, but I don't know of any better solution, short of changing the WASI APIs, which would also be ugly.
The wasmtime-wasi
crate is just glue to connect the WASI implementation with Wasmtime's runtime data structure, so I don't think you need it.
I think the thing to start with is compiling wasi-common with wasm32-unknown-emscripten. That should give you a library containing C symbols prefixed with wasi_common_
, such as wasi_common_fd_read
. These are generated by a macro, wasi_common_cbindgen
, so they should have C-like API compatible with WASI's external C API, except for the addition of the WasiCtx
and memory
arguments to functions which need access to extra state.
The next step is to start adapting the current polyfill's wasi.js to call that library. I think the main things to do are:
___wasi_*
functions to wasi_common_*
functions (or maybe _wasi_common_*
, because Emscripten prepends _
to external names?)WasiCtx
and memory
arguments as needed (see the signatures in src/hostcalls in wasi-common)fd_table_init
, to allocate the WasiCtx
and memory.@sunfishcode Thank you very much for the clear instruction.
If I understand correctly, we are relying on the nix
crate for the syscalls. Unfortunately nix
@ 0.14.1
has yet to support cfg(target_os="emscripten")
. I will prepare a patch to nix
for that first, if you don't mind.
Hey @dingxiangfei2009, some (if not most) of our syscalls impl does indeed depend on the nix
crate, although we are now working towards reusing as much Rust's libstd
as possible. This way, libstd
will take care of a lot of platform-specific tricky details for us :-)
@kubkon I see. I assume that CraneStation/wasi-common#16 is related as well, so I will watch that list of functions.
Also, would you mind giving some hints on which function can reuse libstd
? I have, for example, investigated host calls in fs.rs
, but it would be demanding to switch to libstd
since they are handling raw file descriptors while libstd
does not accept raw file descriptors in most cases.
@kubkon I see. I assume that CraneStation/wasi-common#16 is related as well, so I will watch that list of functions.
Also, would you mind giving some hints on which function can reuse
libstd
? I have, for example, investigated host calls infs.rs
, but it would be demanding to switch tolibstd
since they are handling raw file descriptors whilelibstd
does not accept raw file descriptors in most cases.
That's true, but as it turns out, we don't need to operate on raw descriptors at all times anymore. Instead, it's preferable to use std::fs::File
which is essentially a wrapper around a raw FD or Windows Handle. And so, to provide a nice example, fd_datasync previously did a call to either nix::unistd::fdatasync
or nix::unistd::fsync
depending on your nix platform, whereas now it uses std::fs::File::sync_data
which internally, on a nix platform calls one of the above for us, and, as an added bonus, it also performs the correct syscall on Windows.
So all in all, the current trend is to move away from relying on raw descriptor manipulation as much as possible (of course, sometimes, like in the case of path manipulation, it might not be possible).
I hope that made it a bit clearer. If not, fire away and I'll try to explain it more! :-)
@kubkon Thanks for the explanation, and I managed to get some findings after poking around. I have looked deeper into src/hostcalls_impl
module and I have not find a syscall
that can be switched over. Some of them do not have windows support even, where unimplemented!
was filled in at poll_oneoff
, clock_res_get
, etc.
Access mode of file descriptors look promising and OpenOptions
could be providing the necessary information. However, a File
does not expose them after it is created.
Maybe I am overlooking something. Would you mind giving a direction here, like exactly which syscall can be "simplified" with libstd
?
Speaking of overlooking stuff, I did find that fd_filestat_set_size
can be adapted to File::set_len
instead. Sorry! Anyway, I am eager to hear suggestions about other "simplification" opportunities!
Also, since emscripten
target appears to be compatible with linux
target anyway, wouldn't it be easier to just patch nix
to support target_os="emscripten"
? Of cause, I could be very wrong here about this and please correct me here if so.
@dingxiangfei2009 patching nix
is a good idea in and of itself, that's for sure, so please feel free to do so -- it'll be a very worthwhile contribution, I'm sure of it :-) As far as our use of nix
in wasi-common
is concerned, perhaps it'll help if I outline what is where in the project (I probably should've done that in the first place, apologies!):
src/hostcalls_impl
due host-specific impl (e.g., usage of openat
or readlinkat
from nix
crate is one example), or we haven't got to that yet but are working on itAll in all, I think that your plan of patching nix
first is a good idea (let me know if you need any help with that!). Afterwards, we can start figuring out how to enable emscripten
target in wasi-common
and slowly start porting over things mentioned by @sunfishcode. How's that sound everyone @dingxiangfei2009 @sunfishcode? :-)
After putting in some thought into this and doing a little experiment with nix
, patching up the nix
crate so that it compiles for the wasm32-unknown-emscripten
target is not such a trivial task after all as I originally assumed. It is doable, however it will not be immediately testable which IMHO makes this task infinitely more tedious and convoluted than originally planned. Then, even after it is patched, it will probably take some (long?) time for it to make it to a release, and I'd really like to have wasi-common
compile to Emscripten sooner rather than later.
My proposition here is that we remove nix
as a dependency altogether and write our own wrappers around the libc
functions - after all, we've already kinda started giving up on nix
anyhow when @marmistrz started refactoring fd_readdir
on Linux and implementing it on Windows. Besides, we don't actually need that much of nix
's machinery; we're only ever using a couple of wrappers which should be trivially wrapped by us. I'm happy to do the heavy-lifting here myself, but would appreciate code review and later help in refactoring bits here and there. As an added bonus of this approach, we should be able to wrap up all uses of libc
functions we currently use in wasi-common
, making the implementation so much cleaner rather than like now, when some use raw libc
and some rely on nix
.
cc @sunfishcode @marmistrz @peterhuene
Actually, I'd appreciate your feedback @alexcrichton on this as well.
If nix is an obstacle, then I think dropping the dependency on nix is a sound decision, especially that we have a slightly different understanding of unsafe
than the nix guys.
Could we possibly keep the Rusty wrappers in a separate crate, similar to winx
, and publish it on crates.io? I propose the name: younix
!
I agree on keeping the wrappers in a separate crate :-)
This sounds like a reasonable solution to me, it's likely inevitable that there's a sys/*
style hierarchy as libstd has, and emscripten would be "Just Another Platform" that can be targeted.
Me too :slightly_smiling_face:
It’s settled then! I’ve already started prepping some ground work and should be able to deliver a working (not necessarily final mind you) solution, i.e., wasi-common
nix
-free, soon-ish.
@marmistrz, how about yanix
for a name? Yet another nix
crate... 😉
@kubkon yanix
is missing the wordplay (and I love wordplays! :)), but that's fine for me too.
Closing this old issue, as the JS polyfill referenced here is no longer in use.
Wasmtime has a JS polyfill implementation of WASI:
https://github.com/CraneStation/wasmtime/tree/polyfill/wasmtime-wasi/js-polyfill
which works by compiling wasmtime's C implementation of WASI with Emscripten, and then adding a bit of support code on top of that.
We should port the support code to work on top of wasi-common, compiled with wasm32-unknown-emscripten. In theory, all the big pieces are in place -- wasi-common provides the same functionality as Wasmtime's C implementation of WASI, Emscripten can provide the same underlying support in both cases, and the support code should roughly be the same. So the work here is in rewiring everything up in wasi.js, and porting polyfill.c to Rust, probably splitting out that
EM_ASM
into a standalone JS file.I've marked this "help wanted" -- if someone's interested in doing this port, that'd be great (and I can help mentor), but it'd even help just to have more ideas for packaging, binding, or other ways we could make the polyfill better. The current support code and shell.html is super minimal.