Closed rossberg closed 1 year ago
Looks OK.
The bulk array instructions (from #363) appear to be missing.
Personally, I would have compacted the opcode encoding space a lot more, essentially to assign a contiguous range. My reasoning is that whatever gaps we choose to leave now have an exceedingly small likelihood to turn out to have appropriate size, so as we add instructions in future proposals we'll likely end up with a wildly fragmented mix of remaining gaps that were too large and others that were too small where we then had to simply append new opcodes at the end of the range or squeeze them into "non-fitting" gaps. OTOH, it's not a big issue because (1) at the end of the day we're just choosing between different kinds of ugly for the encoding space a couple of proposals down the line, and (2) prettiness or ugliness of the binary encoding doesn't actually matter.
FWIW, we tried to leave useful gaps in the SIMD proposal opcode space, and they did not turn out to be useful at all. If we had instead used a compact opcode space, we might not have needed to use an extra LEB byte for the relaxed SIMD opcodes.
I don't feel strongly either way, though.
+1 to just compacting the opcode space without gaps. The 1-byte opcode space is important, but IMO there is marginal benefit, if any, in leaving holes in the prefixed opcode spaces.
Nit: I'd suggest to keep different encodings for the same instruction together, i.e. put ref.test (ref null ht)
at 0xfb41and ref.cast (ref ht)
at 0xfb42.
I agree that we should fill up gaps before going to two bytes, but currently we are far away from hitting that boundary, and never may. Is there any particular advantage in compacting before that? I thought that leaving some gaps actually worked reasonably well for some of the Wasm 1.0 code space.
@askeksa-google, good point, fixed.
Is there any particular advantage to leaving gaps?
FWIW, here's the summary:
This PR currently lists 27 0xfb
-prefixed instructions (still missing 4 array bulk instructions), and pads them with 53 gaps. These gaps are located in the following ranges of opcodes:
0x4f
is seen as the end of the range)One data point to illustrate the low likelihood of gap sizes being lucky guesses: Aske has previously suggested a set of 8 branching i31 instructions, which (if we end up deciding to adopt them in a future proposal) wouldn't have an obvious fit in any of these gaps.
I don't think we should leave gaps in the prefixed opcode spaces. First, as mentioned above, I think it's unlikely we'll predict the size of gaps properly, meaning there will likely always be vestigial gaps. Second, leaving gaps reduces the number of two-byte encodings being utilized; in the future, will new instructions start filling in gaps if they'd otherwise overflow into the 3 byte space? Third, engines may way to use some of the unused opcode space for internal things, like quickened versions of bytecodes, which benefits interpreters.
Coming back to this:
@titzer, I didn't follow your point about custom opcodes. Isn't that orthogonal?
@titzer, I didn't follow your point about custom opcodes. Isn't that orthogonal?
It's a minor point, but it's often useful to have a few opcodes left over for the VM, in each of the opcode spaces. This is easier to do if the opcode space fills up from the lowest numbers first. It doesn't work very well to try to use an opcode that is currently a hole and later filled up.
We left gaps in the MVP opcodes, and that proved relatively useful, both for new types and for new control instructions.
The 1-byte opcode space is a much more important space, but in retrospect, I think we would have been fine to have not left holes in it.
This is easier to do if the opcode space fills up from the lowest numbers first.
I see, but this is mostly a question of leaving sufficient space at the end, right? Which I would think we have even now.
Let's discuss this at the subgroup meeting tomorrow. If we don't come to an agreement through discussion, I propose that we take a popularity vote to settle the issue. It doesn't seem important enough to be worth spending much additional energy or time on.
I won't be able to attend tomorrow because of a conflict, but I would generally prefer that we pack the prefixed opcode spaces rather than leaving holes. Is a wider CG discussion warranted, as this could be a potentially precedent-setting decision?
If the purpose is to have some nice groupings, we could still do that while leaving significantly smaller holes, e.g.:
0x00+
: Struct instructions0x10+
: Array instructions (including bulk)0x20+
: Conversions (extern conversions from 0x24
)0x28+
: CastsIs a wider CG discussion warranted, as this could be a potentially precedent-setting decision?
Maybe, but it seems extremely low priority and CG time is hard to come by these days.
We didn't have a critical mass of people to resolve this at the meeting yesterday, but I'd like to take the temperature here. We are only adding 31 prefixed opcodes, so even if we leave generous holes, we will not be close to overflowing to the second LEB byte, which would require 128 opcodes. Given that, do you:
🚀 : prefer leaving holes ❤️ : prefer not leaving holes 👀 : not care at all
Whatever direction this goes in, I think it will have significant aesthetic value, and probably some practical value as well, if encodings that are obvious pairs (last 8 in the currently proposed ordering in particular) are aligned to only differ in the lsb.
As discussed at the last subgroup meeting, I've pushed a commit that simply removes all of the holes in the instruction opcode space but preserves the order of instructions.
Here's a sheet to help visualize the difference: https://docs.google.com/spreadsheets/d/1eaMdpL-MwLOwWgj-mWgv_tNq5crP_nemQ-PW54ss-Q8/edit?usp=sharing
Unfortunately, when you just close the holes like that, the obvious pairs of instructions all start on odd numbers, so they differ in their low two bits rather than just their lowest bit. I added a third column to the spreadsheet skipping a single opcode to "fix" that, but I don't think it's worth it.
I do think it's worth skipping an opcode to align the pairs, especially since for ref.test
and ref.cast
, the lsb then becomes a null flag with the same meaning as the null flags in br_on_cast[_fail]
.
We still stay within the first 32 opcodes.
Another way we could align the pairs would be to take the three i31 operations and move them to the end. @rossberg, would you prefer to have a single hole before the i31 operations or to move the i31 operations to the end?
I don't care much either way, but if you change anything, please sync the interpreter as well in this PR, now that I already updated it.
If we move the i31
instructions to the end, I suggest we also move the extern
instructions after the casts, in order to have all "conversions" (extern
and i31
) grouped together.
FWIW, if we care for bits that way, shouldn't we also make it so that any _u/_s pairs of instructions only differ by LSB?
Perhaps, although I would say that struct.get{,_u,_s}
and array.get(,_u,_s}
are triples, so it doesn't matter so much there, leaving only i31.get_u
and i31.get_s
. @askeksa-google, do you have an opinion here?
I notice that this is already not the case for some MVP instructions, so I think we can leave it as is.
Cool, let's put this into something of a "final comment period." If you are ok with the currently proposed encoding, please give this comment a 🚀 react. Otherwise, please give it an 👀 react and comment with a specific change you would like to see. After a week or so, if we have consensus, we can go ahead and merge this.
Thanks, everyone. I'll go ahead and merge this now since there have been no objections.
Here is my proposal for the final opcode reordering (fixes #337 and #370).
Trying to group related instructions together and leave space for future extensions in respective opcode ranges.
Also reordering type opcodes to group nullary constructors together and before non-nullary ones (c.f. #337).