Closed badeend closed 7 months ago
That's a really great idea. I feel like we even discussed this at some point in the past and then I forgot to include this in CanonicalABI.md. In particular, in the Preview 3 timeframe, I'm expecting there to be some canonical built-ins that want to optionally return a handle which would benefit from being able to simply return an i32
.
-1
is another option for a sentinel value worth considering. I just noticed that CanonicalABI.md is missing an integer overflow trap btw... but assuming I add that, we could just decrease the max. 0
seems like the better sentinel value for catching errors, as you pointed out. However, 0
means either burning an extra element per resource table or adding an extra subtraction on the access path. Making 0
a sentinel would technically be a breaking change (since handle indices are currently deterministically specified), but it seems like code "shouldn't" depend on the precise indices so maybe it's not in practice.
@alexcrichton (and anyone else) WDYT?
Until very recently I made the wasmtime-wasi implementation refuse to use indices 0,1,2 for pseudo-resources, and then resources, in order to catch programs that were accidentally assuming stdio lived there. This did catch some bugs in the early days of pseudo-resources, but I'm not sure if it would since we introduced proper resources in Wasmtime. We got rid of that special case recently on the grounds that we probably shook out all of those bugs. I would be fine with this change.
-1 is another option for a sentinel value worth considering. I just noticed that CanonicalABI.md is missing an integer overflow trap btw... but assuming I add that, we could just decrease the max. 0 seems like the better sentinel value for catching errors, as you pointed out.
0
would definitely be my preferences because of the pervasiveness of 0 initialization/coercion in the programming language design universe. (In)famous example being JavaScript: Number(null) === 0
.
However, 0 means either burning an extra element per resource table (...)
As an example; C# uses 0-initialization for all fields. Removing 0 as a valid handle value would enable handle types to be safely emitted as struct
s instead of class
es, removing the need to perform a heap allocation per handle. So in this specific case it would indeed add a one-time per-table cost on the host side, but could remove a per-resource cost on the client side.
Making 0 a sentinel would technically be a breaking change (since handle indices are currently deterministically specified), but it seems like code "shouldn't" depend on the precise indices so maybe it's not in practice.
If it helps, I could submit PRs to existing component-model implementations (wasmtime, JCO, others?) to already start reserving the 0th index. 1) to minimize the practical impact of a potential transition to preview3, and 2) to gather real world experience on how much overhead we're actually talking about here.
I think this would be a great change to make! I also would agree we can probably do this today without actually breaking anything in practice except for some tests in Wasmtime. I also think it'd be reasonable to reserve 0 and -1 at the same time too.
Thanks for the feedback all. Good idea about reserving both 0
and -1
. I even wonder if, given that 1 billion handles seems like Enough For Anyone (and if not, 4 billion is probably not enough either and so what you really want is to use an i64
opted-into via new canonopt
) it would be a good idea to set the max to 2**30 - 1
so that the high two bits are free (e.g., for use in conjunction with Smis). I'm happy to propose that change.
Would it make sense to also reserve 1 and 2 too, to catch misuses of stdout/stderr? Or perhaps even more low values, to help catch conflations of libc fds in general with canonical abi handle indices?
I think it's reasonable to chop off the upper two bits yeah. I'm not sure where to draw the line on the lower bound though. Zero seems like a good idea to rule out since lots of things like to zero-initialize things into a "null" state, but anything else to me feels like it's almost a bit too arbitrary to bake into the canonical ABI. For example folks are already going to have to deal with other surprises such as there are two handles with index N - just different types. While we could play whack-a-mole with trying to cover up gotchas (like 1/2 in terms of stdio/stderr) I'm not sure if it's worth it per se.
I don't really feel strongly either way though.
As long as we're happily bikeshedding, let me pile on another one:
I'm not sure where to draw the line on the lower bound though
How about 2147483648 😛 ? Ie. start numbering handles from s32::MIN, incrementing up to (but not including) 0.
0
: Never a valid handle. This solves my original question.1
: Never a valid handle. This solves Dan's remark.2
: Never a valid handle. This solves Dan's remark.0 through 0x7FFFFFFF
: Never a valid handle. Usable by applications, similar to wasm-gc's i31
-1
: Technically a valid handle number, but this will only be allocated at the very last. So in practice probably good enough for catching off-by-one errors.It requires a bit-masking operation on the data access path, though.
I (non-strongly) feel the same as Alex, although definitely an interesting idea. While 0
as "null" is rather universal, 1
and 2
as stdout/stderr is rather specific to just file descriptors. Using negative numbers is also an interesting idea, but I worry it creates more trouble than it's worth (bitwise arithmetic, harder to read when debugging, source of bugs if guest code wants to covert to a dense array index, ...).
I feel there is a consensus on reserving at least 0
, so why not start with that and see what happens from there.
Ok, PR up in #284.
to be an always-invalid "NULL" handle?
Two benefits that come to mind for client languages:
Option<NonZeroU32>
)