WebAssembly / WASI

WebAssembly System Interface

Rename FD to "handle" #62

Open AndrewScheidecker opened 5 years ago

AndrewScheidecker commented 5 years ago

The current design has a hidden table of implementation-defined objects that are indexed by the integer "file descriptors" exposed through the API.

These "file descriptors" could be more generally called handles, and the table of implementation-defined objects generalized to include other kinds of objects: processes, threads, mutexes, clocks, etc.

This would allow, for example, WASI APIs that work with process objects without exposing a global process ID namespace, and without adding a separate "process descriptor" table that maps indices to process objects.

The impact on the current API would be pretty minimal:

typedef struct __wasi_fdstat_t
{
    __wasi_filetype_t fs_filetype;
    __wasi_fdflags_t fs_flags;
-   // __wasi_rights_t fs_rights_base;
-   // __wasi_rights_t fs_rights_inheriting;
} __wasi_fdstat_t;
-typedef int32_t __wasi_fd_t;
-__wasi_errno_t __wasi_fd_close(__wasi_fd_t fd);
-__wasi_errno_t __wasi_fd_renumber(__wasi_fd_t from,
-                                  __wasi_fd_t to);
-__wasi_errno_t __wasi_fd_fdstat_set_rights(__wasi_fd_t fd,
-                                           __wasi_rights_t base,
-                                           __wasi_rights_t inheriting);
-__wasi_errno_t __wasi_fd_prestat_get(__wasi_fd_t fd,
-                                     __wasi_prestat_t* out_prestat);
+typedef intptr_t __wasi_handle_t;
+typedef uint32_t __wasi_object_kind_t;
+#define __WASI_OBJECT_KIND_FD (UINT32_C(0))
+__wasi_errno_t __wasi_handle_close(__wasi_handle_t handle);
+__wasi_errno_t __wasi_handle_renumber(__wasi_handle_t from,
+                                      __wasi_handle_t to);
+__wasi_errno_t __wasi_handle_set_rights(__wasi_handle_t handle,
+                                        __wasi_rights_t base,
+                                        __wasi_rights_t inheriting);
+__wasi_errno_t __wasi_handle_get_rights(__wasi_handle_t handle,
+                                        __wasi_rights_t* out_base,
+                                        __wasi_rights_t* out_inheriting);
+__wasi_errno_t __wasi_handle_get_object_kind(__wasi_handle_t handle,
+                                             __wasi_object_kind_t* out_kind);
+__wasi_errno_t __wasi_handle_prestat_get(__wasi_handle_t handle,
+                                         __wasi_prestat_t* out_prestat);

Everywhere the API currently takes a __wasi_fd_t, it would take a __wasi_handle_t instead. EBADF could be aliased as EBADHANDLE and returned when an invalid handle is used. ENOTCAPABLE could be returned when an API function is called with a handle that references the wrong kind of object (presumably the handle will not have the capabilities corresponding to that function), or a distinct error code could be added.
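As a rough illustration of the generalized table, here is a minimal host-side sketch. It assumes a fixed-size table and uses hypothetical names without the `__wasi_` prefix; `WASI_OBJECT_KIND_PROCESS`, `wasi_handle_alloc`, and the numeric error values are placeholders that are not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; only the overall shape follows the proposal above. */
typedef intptr_t wasi_handle_t;
typedef uint32_t wasi_object_kind_t;
typedef uint16_t wasi_errno_t;

#define WASI_OBJECT_KIND_FD      UINT32_C(0)
#define WASI_OBJECT_KIND_PROCESS UINT32_C(1) /* assumed extra kind */

/* Illustrative values only, not the real WASI error numbers. */
enum { WASI_ESUCCESS = 0, WASI_EBADF = 1 };

#define HANDLE_TABLE_SIZE 16

typedef struct {
    int in_use;
    wasi_object_kind_t kind;
    void *object; /* implementation-defined object */
} handle_entry_t;

static handle_entry_t handle_table[HANDLE_TABLE_SIZE];

/* Returns the live entry for a handle, or NULL if the handle is invalid. */
static handle_entry_t *handle_lookup(wasi_handle_t h) {
    if (h < 0 || h >= HANDLE_TABLE_SIZE || !handle_table[h].in_use)
        return NULL;
    return &handle_table[h];
}

/* Host-internal helper: stores an object in the first free slot. */
wasi_handle_t wasi_handle_alloc(wasi_object_kind_t kind, void *object) {
    for (wasi_handle_t h = 0; h < HANDLE_TABLE_SIZE; ++h) {
        if (!handle_table[h].in_use) {
            handle_table[h] = (handle_entry_t){1, kind, object};
            return h;
        }
    }
    return -1; /* table full */
}

wasi_errno_t wasi_handle_get_object_kind(wasi_handle_t h,
                                         wasi_object_kind_t *out_kind) {
    assert(out_kind != NULL);
    handle_entry_t *entry = handle_lookup(h);
    if (entry == NULL)
        return WASI_EBADF; /* would be aliased as EBADHANDLE */
    *out_kind = entry->kind;
    return WASI_ESUCCESS;
}

wasi_errno_t wasi_handle_close(wasi_handle_t h) {
    handle_entry_t *entry = handle_lookup(h);
    if (entry == NULL)
        return WASI_EBADF;
    entry->in_use = 0;
    entry->object = NULL;
    return WASI_ESUCCESS;
}
```

In this sketch a file and a process share one index space, so no separate "process descriptor" table is needed; the kind is recovered per handle on demand.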

The same would apply if the integer handles were replaced by anyref: typedef anyref __wasi_handle_t

dumblob commented 5 years ago

My words - see https://github.com/WebAssembly/WASI/issues/1#issuecomment-479408778 and https://github.com/WebAssembly/WASI/issues/1#issuecomment-480387566 .

eddyp commented 5 years ago

In the IoT space and in even more limited environments, it is possible that an FS-like system exists but implements something very different from the typical tree-based FS.

For instance, flash filesystems could target raw flash, or rely on hardware that provides a translation layer.

Similarly, one can envision an EEPROM-based thin-layer FS that does wear leveling and only "exposes" a few files to store calibration info for some sensors (e.g. the TooDry-TooWet markers for a moisture sensor in an automated plant watering system) for multiple units. Such a system could present them as files "plant1.data", "plant2.data", etc., or as something even lighter like inode=123, inode=789, or, even nicer, abstracted out as a file handle.

Needless to say, string-based names make for friendly interfaces, but they would be unnecessary overhead for such a simple system. I am aware this raises a reasonable question about how you get from a human-readable identifier to a handle. I guess naming the "files" in the code "1", "2", "3" could be an acceptable compromise, and it would be the WASI runtime's job to map the actual names to handles. The key point is: once the WASI runtime has translated the human-readable names, assumptions about the implementation and about what the returned type is should be limited to "we don't know, it's a black box", and the semantics could be implemented correctly by the WASI runtime, whatever definition of "correctly" might apply.

I'm not sure I get all the aspects of the discussion above, but I hope this helps a little.

sbc100 commented 5 years ago

To me the words handle, descriptor and reference all seem fairly well suited to describing an opaque reference. The word descriptor, to me, doesn't imply an integer, and handle doesn't imply a non-integer. So I don't think the name makes much difference.

From a type system POV I think it would be nice if the first argument to the file API functions was not just a generic handle but some sort of file handle. It would be nice to maintain subtyping here. It doesn't make sense to be able to pass a non-file handle (e.g. a process handle) to file_read. So even if we rename to handle, perhaps we could still include the word file, e.g. __wasi_file_handle_t?
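One way this could look in C, as a sketch only: wrapping the index in distinct single-member structs gives the compiler two incompatible types with the same representation. The names `wasi_file_handle_t`, `wasi_process_handle_t`, `example_file_read`, and `example_process_signal` are hypothetical and used purely for illustration.

```c
#include <stdint.h>

/* Hypothetical wrapper types: same u32 representation, distinct C types. */
typedef struct { uint32_t index; } wasi_file_handle_t;
typedef struct { uint32_t index; } wasi_process_handle_t;

/* Stand-in for a file API; returns the table index it would operate on. */
uint32_t example_file_read(wasi_file_handle_t file) {
    return file.index;
}

/* Stand-in for a process API. */
uint32_t example_process_signal(wasi_process_handle_t process) {
    return process.index;
}
```

With these definitions, passing a `wasi_process_handle_t` to `example_file_read` is rejected at compile time as an incompatible argument type, whereas two plain `typedef uint32_t` aliases would be silently interchangeable.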

pchickey commented 5 years ago

I like the term handle. But rather than change the name of fd, I'd prefer to keep the fd type, and have its kind (meaning, the type of the type) be a new concept called handle: see #117.

This way, the existing fd apis can continue to take an $fd_t argument, with no change to the representation of the argument, but as we add other handle types, we retain some type information that can be used to reject incorrect uses at runtime, or that can be used to generate type-safe interfaces (for languages that support that sort of type safety - likely not C) from the Witx description.

We should then make a new error enum value EBADHANDLE to specifically represent that the wrong type of handle was used, whereas EBADF indicates that the fd-typed handle provided is not suitable for some other reason.
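A minimal sketch of how a runtime might separate the two errors under this scheme. The entry layout, the `check_fd` helper, and the numeric error values are assumptions made for illustration; treating an out-of-range index as EBADF is also an assumption, since the comment above does not say which error applies there.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handle types sharing one u32 index space. */
typedef enum { HANDLE_TYPE_FD, HANDLE_TYPE_PROCESS } handle_type_t;

typedef struct {
    handle_type_t type;
    int open; /* nonzero while an fd-typed handle is usable */
} handle_slot_t;

/* Illustrative values only, not the real WASI error numbers. */
enum { ERR_SUCCESS = 0, ERR_BADF = 1, ERR_BADHANDLE = 2 };

/* Validates that `index` names a usable fd-typed handle. */
int check_fd(const handle_slot_t *table, size_t len, uint32_t index) {
    if (index >= len)
        return ERR_BADF;      /* assumption: unknown index reported as EBADF */
    if (table[index].type != HANDLE_TYPE_FD)
        return ERR_BADHANDLE; /* wrong type of handle */
    if (!table[index].open)
        return ERR_BADF;      /* fd-typed, but unsuitable for another reason */
    return ERR_SUCCESS;
}
```

Note that this use of EBADHANDLE differs from the original post, where it was suggested as an alias for EBADF.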

We could then discuss refactoring the path apis to use a subtype of fd called dirfd to make that distinction formal in the type system - we already do this check at runtime - or maybe redesign a bit to make path and fd two distinct types of handles.

In the presentation I gave at the WASI http meeting yesterday, I proposed introducing a handful of types of kind handle (including polymorphic types!), which would be represented using the same u32 index space as fd. However, in that presentation, I called them cap as short for capability, and used a syntax that was easier to read on the slides than witx. I've been talked into believing handle is a better name than cap or capability. I prefer handle over descriptor because the word is shorter to say and type, but my opinion on that is not very strong.

I'm not a fan of calling them reference, because that name is already used for a closely related concept in the WebAssembly reference types proposal. When we are able to specify WASI in terms of interface types, we will have the flexibility to use either a typed reference, or an integer index (same as now), as the concrete representation for the abstract handle, depending on what the consumer of the interface wants.

NotWearingPants commented 5 years ago

Here's my take on why this naming is a problem:

The problem is not whether it's descriptor or handle or reference, the problem is that it is explicitly a file, when it is actually a generic resource.

So maybe we could call it rd for "resource descriptor", or rh for "resource handle", or just descriptor/handle.

As for the type system POV mentioned by @sbc100, I agree typing is good, so we should have fd inherit from rd as opposed to fd being the base type.

igrep commented 4 years ago

Besides the name (I like both descriptor and handle!), I agree with this change because it makes some kinds of WASI hosts simpler: they no longer have to manage a table mapping an FD (currently u32) to the object representing an I/O stream.

I have in mind (and am implementing) a WASI host app that forwards the output to its view when WASI modules write bytes to stdout/stderr. In such apps, there are only objects that can receive or send byte strings and do something with them (e.g. OutputStream and InputStream in Java, console.log in the browser, etc.), and there are no mappings from FDs to such objects, because they are neither files nor sockets managed by the host's OS. So, to implement such apps, we have to create a table that maps FDs to the I/O stream objects ourselves. I think that is an extra task.

But one thing to note: in addition to this change, we should add stdin / stdout / stderr to WASI as handles, because a wasm module can't refer to them directly if they are handles (similar to anyref) instead of u32s.
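To make the "extra task" concrete, here is a rough sketch of the kind of table such a host has to maintain today. Everything in it is hypothetical: `stream_t`, `view_buffer_t`, `host_register_stream`, and `host_fd_write` are illustrative names rather than part of any WASI host API, and the in-memory buffer merely stands in for the app's view.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A host-side stream: a callback that consumes bytes, plus its context. */
typedef size_t (*sink_write_fn)(void *ctx, const uint8_t *buf, size_t len);
typedef struct { sink_write_fn write; void *ctx; } stream_t;

#define MAX_STREAMS 8
static stream_t streams[MAX_STREAMS]; /* the FD -> stream table */

/* Stand-in for the app's view: a bounded in-memory buffer. */
typedef struct { char data[256]; size_t len; } view_buffer_t;

/* Appends as many bytes as fit and reports how many were stored. */
size_t view_write(void *ctx, const uint8_t *buf, size_t len) {
    view_buffer_t *view = ctx;
    size_t room = sizeof view->data - view->len;
    if (len > room)
        len = room;
    memcpy(view->data + view->len, buf, len);
    view->len += len;
    return len;
}

/* Binds an FD number to a stream; returns 0 on success, -1 if out of range. */
int host_register_stream(uint32_t fd, sink_write_fn write, void *ctx) {
    if (fd >= MAX_STREAMS)
        return -1;
    streams[fd] = (stream_t){write, ctx};
    return 0;
}

/* Routes a guest write to the stream bound to `fd`; -1 if none is bound. */
long host_fd_write(uint32_t fd, const uint8_t *buf, size_t len) {
    if (fd >= MAX_STREAMS || streams[fd].write == NULL)
        return -1;
    return (long)streams[fd].write(streams[fd].ctx, buf, len);
}
```

A host would typically bind FD 1 and FD 2 to its view at startup; if handles were opaque references, the guest could hold the stream object directly and the table would be unnecessary.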