Relax requirements on assigning integers to resource handles

alexcrichton commented 2 months ago

Currently in the canonical ABI, resources (own<T> and borrow<T>) are required to strictly follow the specification in terms of how integers are assigned to handles as they are created. Specifically, engines are required to implement per-resource tables with a slab allocation scheme, which is a LIFO allocator for resource indices. In the context of this discussion on Zulip, though, it's been found that while this is a compact indexing scheme, it has the drawback of creating situations that can be difficult to debug. For example, creating an initial resource A gives handle index 1, and then creating an initial resource B also gives handle index 1. If these handle indices are accidentally passed to a function that wants resource C, then all the runtime can do is say "unknown handle index". The runtime doesn't know whether the "1" is of type A or of type B.
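To make the A/B/C example concrete, here is a minimal sketch of per-type slab tables. It is not the canonical ABI's actual Python definition; `SlabTable`, its method names, and the reserved slot 0 are assumptions chosen only to reproduce the behavior described above.

```python
_FREE = object()  # sentinel marking an unused slot


class SlabTable:
    """Hypothetical per-resource-type table with a LIFO free list."""

    def __init__(self):
        self.entries = [_FREE]  # assumption: slot 0 is never handed out
        self.free = []          # stack of freed indices, reused last-in first-out

    def add(self, rep):
        if self.free:
            i = self.free.pop()
            self.entries[i] = rep
        else:
            i = len(self.entries)
            self.entries.append(rep)
        return i

    def get(self, i):
        if not 0 < i < len(self.entries) or self.entries[i] is _FREE:
            raise LookupError(f"unknown handle index {i}")
        return self.entries[i]

    def remove(self, i):
        rep = self.get(i)
        self.entries[i] = _FREE
        self.free.append(i)
        return rep


# One table per resource type, as in the scheme described above.
tables = {name: SlabTable() for name in ("A", "B", "C")}
a = tables["A"].add("rep of A")  # first handle of type A
b = tables["B"].add("rep of B")  # first handle of type B, same integer as `a`

try:
    tables["C"].get(a)  # wrong table: nothing records that `a` was an A
except LookupError as err:
    message = str(err)
```

In this sketch the table for C has no way to tell where the integer came from, so the only error it can produce is the generic one.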

The specific confusing case in question that came up on Zulip was that wasi:io/poll/pollable@0.2.1 was created with index 1 and then passed to a function that wanted wasi:io/poll/pollable@0.2.0. This ended up (rightfully) not working and the runtime raised a trap, but it was difficult to determine that this was what was happening.

In discussion with some folks we had the idea of possibly relaxing the requirements of exactly how indices are allocated. I believe our rough conclusion is that the spec algorithm would change from today's slab allocation to allocating a random, but still unique, index per resource type.

This is a small and subtle change, but the intention is to preserve the best guarantees that both guests and hosts have today without actually breaking anything in practice. Notably:

This enables true RNG generation of handle indices for "fuzzing" a component if desired. This can help weed out accidental mistakes in guests for example.
This enables the host to have a single table of all types of resources. Because the spec says handle indices are random there's no reason that "random" can't mean "it's sequential" for example. This would help solve the above issue because with a single table of all resources the host could provide a better error message.
All existing hosts which implement the component model already match this new specification. The "random" behavior just happens to look like a slab.
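A rough sketch of how "random, but unique" could cover several host strategies at once: the table only enforces uniqueness, and the choice of index is delegated to a policy. The names `HandleTable`, `lowest_free`, and `random_picker`, and the upper bound on indices, are hypothetical and not taken from the spec.

```python
import random


def lowest_free(live):
    """Compact policy: smallest positive index not currently in use."""
    i = 1
    while i in live:
        i += 1
    return i


def random_picker(seed, limit=2**31):
    """Fuzzing policy: pseudo-random unused index in [1, limit)."""
    rng = random.Random(seed)  # not cryptographically secure

    def pick(live):
        while True:
            i = rng.randrange(1, limit)
            if i not in live:
                return i

    return pick


class HandleTable:
    """Hypothetical table that only guarantees unique, positive indices."""

    def __init__(self, pick_index):
        self.live = {}
        self.pick_index = pick_index

    def add(self, rep):
        i = self.pick_index(self.live)
        if i < 1 or i in self.live:
            raise RuntimeError("policy returned an invalid index")
        self.live[i] = rep
        return i

    def remove(self, i):
        if i not in self.live:
            raise LookupError(f"unknown handle index {i}")
        return self.live.pop(i)


compact = HandleTable(lowest_free)
compact_indices = [compact.add(name) for name in "abc"]

fuzzed = HandleTable(random_picker(seed=0))
fuzzed_indices = [fuzzed.add(name) for name in "abc"]
```

A guest that is correct under the relaxed rule should behave the same with either policy; one that hard-codes an expected index would likely fail under the fuzzing policy.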

We'd probably want to document that randomness is not guaranteed, so the handle index shouldn't be used as a seed for a CSPRNG, for example. Other than that, my hope is that while this would complicate the canonical ABI Python bits, it would in the end grant hosts the flexibility to choose the indexing scheme that best matches their needs (or perhaps to provide a configuration knob to select a particular indexing scheme).

sunfishcode commented 2 months ago

A possible alternative here would be to leave the spec as-is, and just observe that it's common for debugging features to deviate from specs, and that implementations could have debugging modes where they randomize or ensure uniqueness across types or other things. This would support the fuzzing scenario.

It wouldn't support hosts using a single table for all types though. On the other hand, it would theoretically make it less likely that guest code could come to depend on hosts that use a single table for all types.

I don't have a strong opinion which way is best here; I just wanted to mention this approach as an option.

lukewagner commented 1 month ago

I definitely see the value of making it easier to develop and catch bugs by having a unique index space shared by all resource types; it mostly just seems like a question of what the best technical approach is to achieving this.

One risk is that, whether or not the spec specifies deterministic indices, if popular runtimes only exhibit one behavior in practice during normal execution (e.g., unless the developer sets a flag), then code will end up accidentally depending on that one behavior and break when a runtime tries to take full advantage of the nondeterminism allowed (or not allowed) by the spec. This could inadvertently make the debugging use case worse because, when I flip the "catch bugs" flag, I might end up triggering some separate bug unrelated to the real bug I'm trying to track down.

One way to catch these accidental dependencies early is to have normal/default execution mode actively take advantage of the nondeterminism. For some types of nondeterminism (e.g., preemptive threads), this happens naturally. But I expect in the case of resource handles, runtimes mostly won't want to do this by default and will mostly just copy each other's behavior.

As an alternative to consider: what if we kept determinism but switched to a single resource table? While it's nice to eliminate the runtime type check with separate resource tables, I expect in practice this could be compiled down to a cheap branch that would be amortized by the overall call.
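A minimal sketch of that alternative, assuming each entry in one shared table carries a type tag that is checked on lookup. `SharedTable` and its error text are hypothetical; the type names are the ones from the pollable example earlier in the thread.

```python
class SharedTable:
    """Hypothetical single table for all resource types, with LIFO index reuse."""

    def __init__(self):
        self.entries = [None]  # assumption: slot 0 is never handed out
        self.free = []         # stack of freed indices

    def add(self, type_name, rep):
        entry = (type_name, rep)
        if self.free:
            i = self.free.pop()
            self.entries[i] = entry
        else:
            i = len(self.entries)
            self.entries.append(entry)
        return i

    def get(self, i, expected):
        if not 0 < i < len(self.entries) or self.entries[i] is None:
            raise LookupError(f"unknown handle index {i}")
        actual, rep = self.entries[i]
        if actual != expected:  # the runtime type check: a single comparison
            raise TypeError(f"handle {i} has type {actual}, expected {expected}")
        return rep

    def remove(self, i, expected):
        rep = self.get(i, expected)
        self.entries[i] = None
        self.free.append(i)
        return rep


shared = SharedTable()
h = shared.add("wasi:io/poll/pollable@0.2.1", "pollable rep")

try:
    shared.get(h, "wasi:io/poll/pollable@0.2.0")
except TypeError as err:
    message = str(err)
```

Allocation stays deterministic here, and the mismatch error can name both the actual and the expected type instead of reporting an unknown index.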

alexcrichton commented 1 month ago

I personally agree that a single table is probably better than what we have today for debuggability and "probably the same perf" reasons you mention. I'd also personally still prefer to at least try to spec random indices being possible, but I don't disagree that it seems unlikely to stick in practice.

lukewagner commented 1 month ago

Thinking about this a bit more, one thing that seems potentially useful for bindings/runtime glue code is knowing that indices are mostly dense. If you can assume that, then if you want to associate state with a C-M table element (which I think will end up being common for async subtasks, streams and futures), you can simply maintain a mirror dense array in linear memory whereas, if the indices are sparse, you'd need a map of some sort which will be somewhat more expensive. That doesn't necessarily force determinism, but it does suggest against allowing random indices that range over [1,2³²].
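The trade-off can be sketched as follows, with a Python list standing in for a mirror array in linear memory and a dict standing in for a map. `DenseSideTable` and the sample indices are illustrative assumptions, not part of any bindings generator.

```python
class DenseSideTable:
    """Hypothetical mirror array: per-handle state stored at the handle's own index."""

    def __init__(self):
        self.slots = []

    def set(self, index, state):
        # Grow to cover `index`; the cost is proportional to the largest index seen.
        if index >= len(self.slots):
            self.slots.extend([None] * (index + 1 - len(self.slots)))
        self.slots[index] = state

    def get(self, index):
        return self.slots[index] if index < len(self.slots) else None


# Mostly dense indices: the mirror array stays about as small as the table itself.
dense = DenseSideTable()
dense.set(1, "state for handle 1")
dense.set(3, "state for handle 3")

# Sparse indices: the same approach would need billions of slots for a single
# handle, so glue code would fall back to a map keyed by index.
sparse = {}
sparse[3_000_000_000] = "state for a sparse handle"
```

The array lookup is a bounds check plus an index, while the map pays for hashing and collision handling on every access, which is the extra cost mentioned above.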

WebAssembly / component-model

Relax requirements on assigning integers to resource handles #395