Open AndrewScheidecker opened 5 years ago
In the IoT space and in even more limited environments, it is possible that an FS-like system exists but implements something very different from the typical tree-based FS.
For instance, flash filesystems could target raw flash directly, or rely on hardware that acts as a translation layer.
Similarly, one can envision a thin EEPROM-based FS layer that does wear leveling and only "exposes" a few files storing calibration info for some sensors (e.g. the TooDry-TooWet markers for a moisture sensor in an automated plant watering system) for multiple units. Such a system could present them as files "plant1.data", "plant2.data", etc., or as something even lighter like inode=123, inode=789, or, even nicer, abstract them as file handles.
Needless to say, string-based names make for friendly interfaces, but they would be unnecessary overhead for such a simple system. I am aware this raises a reasonable question about how you get from a human-readable identifier to a handle. I guess naming the "files" in the code as "1", "2", "3" could be an acceptable compromise, and it would be the WASI runtime's job to map the actual names to handles. The key point is this: once the WASI runtime has translated the human-readable names, assumptions about the implementation and about what the returned type is should be limited to "we don't know, it's a black box", and the semantics could be implemented correctly by the WASI runtime, whatever definition of "correctly" might apply.
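To make the "names in, black-box handles out" idea concrete, here is a minimal sketch in C. Everything in it is hypothetical and for illustration only: `handle_t`, `runtime_open`, `runtime_slot_offset`, the slot names and the EEPROM offsets are not part of WASI or of any real runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical opaque handle: guest code never interprets the value. */
typedef uint32_t handle_t;
#define HANDLE_INVALID UINT32_MAX

/* Hypothetical runtime-side table: which EEPROM record backs each name. */
struct slot_entry {
    const char *name;       /* short name, known only to the runtime */
    uint16_t eeprom_offset; /* made-up offset of the calibration record */
};

static const struct slot_entry slots[] = {
    { "1", 0x00 },
    { "2", 0x10 },
    { "3", 0x20 },
};

#define SLOT_COUNT (sizeof slots / sizeof slots[0])

/* Resolve a name once; afterwards the guest only holds the handle. */
handle_t runtime_open(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < SLOT_COUNT; i++) {
        if (strcmp(slots[i].name, name) == 0)
            return (handle_t)i;
    }
    return HANDLE_INVALID;
}

/* Runtime-internal lookup from handle to storage location.
 * Returns 0 on success, -1 if the handle is not valid. */
int runtime_slot_offset(handle_t h, uint16_t *offset_out)
{
    assert(offset_out != NULL);
    if (h >= SLOT_COUNT)
        return -1;
    *offset_out = slots[h].eeprom_offset;
    return 0;
}
```

In this sketch the handle happens to be a table index, but the guest has no way to rely on that, which is the "black box" property described above.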
I'm not sure I get all the aspects of the discussion above, but I hope this helps a little.
To me the words `handle`, `descriptor` and `reference` all seem fairly well suited to describing an opaque reference. The word `descriptor`, to me, doesn't imply an integer, and `handle` doesn't imply a non-integer. So I don't think the name makes much difference.
From a type system POV I think it would be nice if the first argument to a file API function was not just a generic handle but some sort of `file` handle. It would be nice to maintain subtyping here. It doesn't make sense to be able to pass a non-file handle (e.g. a process handle) to `file_read`. So even if we rename to handle, perhaps we could still include the word `file`, e.g. `__wasi_file_handle_t`?
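One way to picture a typed file handle in C is to wrap the same integer index in distinct struct types, so the compiler rejects a process handle where a file handle is expected. This is only a sketch under assumptions: `wasi_file_handle_t`, `wasi_process_handle_t` and `file_read_sketch` are hypothetical names, and the function body is a stub that zero-fills the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical distinct wrapper types around the same u32 index. */
typedef struct { uint32_t index; } wasi_file_handle_t;
typedef struct { uint32_t index; } wasi_process_handle_t;

/* The wrappers are expected to cost nothing compared to a bare u32. */
static_assert(sizeof(wasi_file_handle_t) == sizeof(uint32_t),
              "file handle should have the same size as its index");

/* Hypothetical stand-in for a file API that only accepts file handles.
 * This stub zero-fills the buffer and reports len bytes as read. */
uint32_t file_read_sketch(wasi_file_handle_t file, uint8_t *buf, uint32_t len)
{
    assert(buf != NULL || len == 0);
    (void)file; /* a real host would look the index up in its table */
    for (uint32_t i = 0; i < len; i++)
        buf[i] = 0;
    return len;
}

/* A call such as
 *     wasi_process_handle_t p = { 5 };
 *     file_read_sketch(p, buf, 4);
 * is rejected at compile time because the struct types are incompatible. */
```

The representation stays a plain index; only the static type differs, which is what lets a generated binding catch the mistake before runtime.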
I like the term `handle`. But rather than change the name of `fd`, I'd prefer to keep the `fd` type, and have its kind (meaning, the type of the type) be a new concept called `handle`: see #117.
This way, the existing `fd` APIs can continue to take an `$fd_t` argument, with no change to the representation of the argument, but as we add other handle types, we retain some type information that can be used to reject incorrect uses at runtime, or to generate type-safe interfaces (for languages that support that sort of type safety, likely not C) from the Witx description.
We should then make a new error enum value `EBADHANDLE` to specifically represent that the wrong type of handle was used, whereas `EBADF` indicates that the fd-typed handle provided is not suitable for some other reason.
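The distinction between the two error values could look roughly like this at runtime. This is a sketch, not the WASI implementation: the `SK_` error names and their numeric values, the table layout and `check_fd_read` are all made up, and treating an unallocated index as `SK_EBADHANDLE` is an assumption the discussion does not settle.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical error codes; the numeric values are placeholders. */
enum sk_errno { SK_ESUCCESS = 0, SK_EBADF = 8, SK_EBADHANDLE = 1000 };

/* Hypothetical kinds of objects that a handle can refer to. */
enum handle_kind { KIND_FREE = 0, KIND_FD, KIND_PROCESS };

/* Zero-initialized table entries must read as unallocated. */
static_assert(KIND_FREE == 0, "KIND_FREE must be the zero value");

struct handle_entry {
    enum handle_kind kind;
    int readable; /* only meaningful for KIND_FD */
};

#define TABLE_SIZE 4
static const struct handle_entry handle_table[TABLE_SIZE] = {
    { KIND_FD, 1 },      /* 0: fd opened for reading */
    { KIND_FD, 0 },      /* 1: fd opened write-only */
    { KIND_PROCESS, 0 }, /* 2: not an fd at all */
    { KIND_FREE, 0 },    /* 3: unallocated */
};

/* Check whether `handle` may be used with an fd read operation. */
enum sk_errno check_fd_read(uint32_t handle)
{
    /* Assumption: an unallocated or out-of-range index is a bad handle. */
    if (handle >= TABLE_SIZE || handle_table[handle].kind == KIND_FREE)
        return SK_EBADHANDLE;
    /* Wrong kind of handle, e.g. a process handle passed to an fd API. */
    if (handle_table[handle].kind != KIND_FD)
        return SK_EBADHANDLE;
    /* Right kind, but unsuitable for another reason. */
    if (!handle_table[handle].readable)
        return SK_EBADF;
    return SK_ESUCCESS;
}
```

Because the kind is stored next to the object, the same check can be generated for every API from its declared handle type.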
We could then discuss refactoring the path APIs to use a subtype of `fd` called `dirfd` to make that distinction formal in the type system (we already do this check at runtime), or maybe redesign a bit to make `path` and `fd` two distinct types of handles.
In the presentation I gave at the WASI http meeting yesterday, I proposed introducing a handful of types of kind handle (including polymorphic types!), which would be represented using the same u32 index space as fd. However, in that presentation, I called them `cap` as short for `capability`, and used a syntax that was easier to read on the slides than witx. I've been talked into believing `handle` is a better name than `cap` or `capability`. I prefer `handle` over `descriptor` because the word is shorter to say and type, but my opinion on that is not very strong.
I'm not a fan of calling them `reference`, because that name is already used for a closely related concept in the WebAssembly reference types proposal. When we are able to specify WASI in terms of interface types, we will have the flexibility to use either a typed reference, or an integer index (same as now), as the concrete representation for the abstract `handle`, depending on what the consumer of the interface wants.
Here's my take on why this naming is a problem:
The problem is not whether it's `descriptor` or `handle` or `reference`, the problem is that it is explicitly a `file`, when it is actually a generic `resource`.
So maybe we could call it `rd` for "resource descriptor", or `rh` for "resource handle", or just `descriptor`/`handle`.
As for the type system POV mentioned by @sbc100, I agree typing is good, so we should have `fd` inherit from `rd` as opposed to `fd` being the base type.
Besides the name (I like both `descriptor` and `handle`!), I agree with this change because it makes some kinds of WASI hosts simpler:
They no longer have to manage a table mapping an FD (currently `u32`) to the object representing the I/O stream.
I have in mind (and am implementing) a WASI host app which forwards the output to its view when WASI modules write bytes to stdout/stderr.
In such apps, there are only objects that can receive/send byte strings and do something with them (e.g. `OutputStream` and `InputStream` in Java, `console.log` in the browser, etc.), and there are no mappings from FDs to such objects, because they are neither files nor sockets managed by the host's OS.
So, to implement such apps, we have to create a table which maps an FD to the I/O stream objects by ourselves. I think that is an extra task.
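For illustration, the extra table such a host has to maintain today could look something like the following. This is a hedged sketch in C rather than Java or JavaScript: `output_stream`, `view_stream`, `fd_table` and `host_fd_write` are hypothetical names, and the in-memory buffer stands in for the app's view.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side stream object: anything that accepts bytes. */
struct output_stream {
    void (*write)(struct output_stream *self, const uint8_t *data, size_t len);
};

/* Example sink that appends to an in-memory buffer (stands in for a view). */
struct view_stream {
    struct output_stream base; /* first member, so we can cast back */
    char text[64];
    size_t used;
};

static void view_write(struct output_stream *self, const uint8_t *data, size_t len)
{
    struct view_stream *view = (struct view_stream *)self;
    assert(view->used < sizeof view->text);
    size_t room = sizeof view->text - 1 - view->used;
    if (len > room)
        len = room; /* this sketch silently truncates long output */
    memcpy(view->text + view->used, data, len);
    view->used += len;
    view->text[view->used] = '\0';
}

static struct view_stream stdout_view = { { view_write }, "", 0 };
static struct view_stream stderr_view = { { view_write }, "", 0 };

/* The extra bookkeeping: an fd-indexed table of stream objects. */
static struct output_stream *fd_table[3] = {
    NULL,              /* 0: stdin, not writable in this sketch */
    &stdout_view.base, /* 1: stdout */
    &stderr_view.base, /* 2: stderr */
};

/* Hypothetical stand-in for the host's write entry point.
 * Returns 0 on success, -1 if the fd has no writable stream. */
int host_fd_write(uint32_t fd, const uint8_t *data, size_t len)
{
    if (fd >= sizeof fd_table / sizeof fd_table[0] || fd_table[fd] == NULL)
        return -1;
    fd_table[fd]->write(fd_table[fd], data, len);
    return 0;
}
```

With opaque handles, the module could hold the stream object itself, and `fd_table` together with the index check would not be needed.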
But one thing to note: in addition to this change, we should add `stdin` / `stdout` / `stderr` to WASI as `handle`s, because a wasm module can't refer to them directly if they are `handle`s (similar to `anyref`) instead of `u32`s.
The current design has a hidden table of implementation-defined objects that are indexed by the integer "file descriptors" exposed through the API.
These "file descriptors" could be more generally called handles, and the table of implementation-defined objects generalized to include other kinds of objects: processes, threads, mutexes, clocks, etc.
This would allow, for example, WASI APIs that work with process objects without exposing a global process ID namespace, and without adding a separate "process descriptor" table that maps indices to process objects.
The impact on the current API would be pretty minimal:
- And everywhere the API currently takes a `__wasi_fd_t` would be changed to take a `__wasi_handle_t`.
- `EBADF` could be aliased as `EBADHANDLE`, and returned when using an invalid handle.
- `ENOTCAPABLE` could be returned when using an API function with a handle that references the wrong kind of object (as presumably the handle will not have the capabilities corresponding to the API function), or a distinct error code could be added.

This would apply the same if the integer handles are replaced by `anyref`: `typedef anyref __wasi_handle_t`