Closed monoclex closed 4 years ago
Hi Josh, great questions! First off, I'm sorry that witx docs are lacking - the language and tooling have been evolving, and I've been too busy to write any. The short story is, witx types resemble C more than Rust right now, for historical reasons - the original spec of WASI snapshot 0 was a C header, and we wanted to keep ABI compatibility when we changed the canonical description to be witx. We expect this to evolve when the Interface Types spec and toolchain support land - many of the same people working on WASI are working on that right now.
Witx `enum`s are C-style - they do not have payloads, they are just an integer where only a subset of values (counting up from 0) is valid. The integer type specified in an enum declaration determines its representation in memory. So, `clockid` is a `u32` in memory, whereas `errno` is a `u16`. (In the particular case of `clockid`, it's wasteful to use 4 bytes to represent 4 variants, but that's what we got stuck with for compatibility.)
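As a minimal sketch of what "just an integer" means here, the C below models an enum over `u32` and one over `u16`. The `example_*` names and the variant count constant are hypothetical, chosen for illustration and not taken from any WASI header; the four-variant count for `clockid` comes from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; they are not taken from a WASI header. */
typedef uint32_t example_clockid_t; /* witx enum declared over u32 */
typedef uint16_t example_errno_t;   /* witx enum declared over u16 */

/* Assumption from the thread: clockid has four variants, numbered from 0. */
enum { EXAMPLE_CLOCKID_VARIANTS = 4 };

/* The whole value is the variant number; no extra byte is stored next to it. */
static_assert(sizeof(example_clockid_t) == 4, "clockid occupies 4 bytes");
static_assert(sizeof(example_errno_t) == 2, "errno occupies 2 bytes");

/* Returns true when raw names one of the defined variants. */
static bool example_clockid_is_valid(example_clockid_t raw) {
    return raw < (example_clockid_t)EXAMPLE_CLOCKID_VARIANTS;
}
```

Under these assumptions the 4 bytes hold only the variant number, so a value such as 4 is simply an invalid `clockid` and not a clock reading.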
Witx `union`s are for Rust-style enums that have different types for each variant. The `union` requires a witx `enum` type be provided for use as the discriminant. Unions are laid out like a C `struct { the_variant_enum tag; union { variant_type_1 variant1; ... } u; }`.
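To make the tag-plus-payload layout concrete, here is a small sketch of a two-variant union in C. The type names, the variant names, and the choice of a `u8` tag with `u8` and `u64` payloads are all assumptions made up for this example; they do not correspond to a real witx type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical discriminant: a witx enum over u8 with two variants. */
typedef uint8_t example_tag_t;
enum { EXAMPLE_TAG_SMALL = 0, EXAMPLE_TAG_LARGE = 1 };

/* Tag first, then a C union holding one payload type per variant. */
typedef struct {
    example_tag_t tag;
    union {
        uint8_t  small; /* payload when tag == EXAMPLE_TAG_SMALL */
        uint64_t large; /* payload when tag == EXAMPLE_TAG_LARGE */
    } u;
} example_union_t;

/* The tag sits at offset 0; the payload starts at or after the end of the
   tag, padded up to the alignment of the largest payload type. */
static_assert(offsetof(example_union_t, tag) == 0, "tag comes first");
static_assert(offsetof(example_union_t, u) >= sizeof(example_tag_t),
              "payload follows the tag");

/* Reads the payload selected by the tag, widened to 64 bits. */
static uint64_t example_union_payload(const example_union_t *v) {
    switch (v->tag) {
    case EXAMPLE_TAG_SMALL: return v->u.small;
    case EXAMPLE_TAG_LARGE: return v->u.large;
    default: return 0; /* invalid tag; a real decoder would report an error */
    }
}
```

The exact padding between `tag` and `u` depends on the target's alignment rules, which is why the sketch only asserts the ordering and not fixed byte offsets.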
If you're using the `witx` crate to parse and validate, the `Layout` trait on AST nodes will provide the size and alignment of any datatype, and there are special `UnionLayout` and `StructMemberLayout` types that describe the interiors of unions and structs.
Thanks for the detailed information! Is there a place I can contribute the information you mentioned (with an obligatory ASCII chart of the byte layout of course) to the docs somewhere? I'd hate for this information to be buried in a Github issue :P
On another note, it seems that `fd` is lacking as well - I've found some sample code which passed in `1` to `fd_write` in place of the `fd` parameter to write to `stdout`, yet the `fd` type mentions nothing about standard out, and there's even an odd empty Subtypes category. I think I may have found a version of the WASI C header file you mentioned and tried to find something relating to `stdout`, but doing so doesn't bring up any related `#define` constants either. Are these types documented somewhere as well? (There are definitely other areas of the document I've missed, but all I did was skim the file and try to figure out how one could get a hello world up and running.)
That is another great question - `stdin`, `stdout`, and `stderr` are the first three fds that are preopened. They're distinct from other fd preopens that are actually directory fds (they map to a path prefix, not a file).
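A rough sketch of that numbering in C follows. The `example_*` names are hypothetical (as noted in this thread, the spec itself defines no such constants), and treating an fd as a 32-bit unsigned integer is an assumption of the sketch. The value 1 for `stdout` matches the `fd_write` sample mentioned earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: an fd is passed around as a 32-bit integer. */
typedef uint32_t example_fd_t;

/* Hypothetical constant names; the first three preopened fds, in order. */
enum {
    EXAMPLE_FD_STDIN  = 0,
    EXAMPLE_FD_STDOUT = 1,
    EXAMPLE_FD_STDERR = 2
};

static_assert(EXAMPLE_FD_STDOUT == 1, "matches the fd_write sample in the thread");

/* Returns true for the three stdio fds, false for any later fd. */
static bool example_fd_is_stdio(example_fd_t fd) {
    return fd <= (example_fd_t)EXAMPLE_FD_STDERR;
}
```

A hello world would then pass `EXAMPLE_FD_STDOUT` wherever the `fd` parameter of `fd_write` is expected.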
The subtypes category is a different thing from that concept. The idea there is that there will be a type hierarchy (subtyping relationships) between handles, so we can eventually use the type system to distinguish between directory fds, file fds, stream fds, and so on. That whole aspect of witx and implementations is not fleshed out at all, and probably should not have been added until it was actually going to be used.
Not all `#define` constants got added to the witx spec - I think there's one weird one about dircookie, but the `int` type is supposed to represent integers that have certain defined constants.
As for docs, I think making a markdown file about witx in the `/design/` directory of this repo is the way to go. Thank you!
I submitted some beginner quality PRs to describe the layout of witx types in #318, and to draft a document helping explain file descriptors in #319. I chose to do nothing about the subtypes, since it was just something added early. Closing this issue as further resolutions to this can take place in their respective PRs.
For the `clockid` enum, it seems to be an `Enum(u32)` with four variants. Is the `u32` used to represent the value of the clock? The size/alignment is 4, but if the clock reports 4 bytes surely there must be another byte to store the variant? Whilst I understand Rust enums conceptually, to me this doesn't make sense in terms of how I'd implement this in a cross-platform manner for other languages. Can this be clarified?