Final opcodes - Githubissues

rossberg commented 1 year ago

Here is my proposal for the final opcode reordering (fixes #337 and #370).

Trying to group related instructions together and leave space for future extensions in respective opcode ranges.

Also reordering type opcodes to group nullary constructors together and before non-nullary ones (c.f. #337).

jakobkummerow commented 1 year ago

Looks OK.

The bulk array instructions (from #363) appear to be missing.

Personally, I would have compacted the opcode encoding space a lot more, essentially to assign a contiguous range. My reasoning is that whatever gaps we choose to leave now have an exceedingly small likelihood to turn out to have appropriate size, so as we add instructions in future proposals we'll likely end up with a wildly fragmented mix of remaining gaps that were too large and others that were too small where we then had to simply append new opcodes at the end of the range or squeeze them into "non-fitting" gaps. OTOH, it's not a big issue because (1) at the end of the day we're just choosing between different kinds of ugly for the encoding space a couple of proposals down the line, and (2) prettiness or ugliness of the binary encoding doesn't actually matter.

tlively commented 1 year ago

FWIW, we tried to leave useful gaps in the SIMD proposal opcode space, and they did not turn out to be useful at all. If we had instead used a compact opcode space, we might not have needed to use an extra LEB byte for the relaxed SIMD opcodes.

I don't feel strongly either way, though.

titzer commented 1 year ago

+1 to just compacting the opcode space without gaps. The 1-byte opcode space is important, but IMO there is marginal benefit, if any, in leaving holes in the prefixed opcode spaces.

askeksa-google commented 1 year ago

Nit: I'd suggest to keep different encodings for the same instruction together, i.e. put ref.test (ref null ht) at 0xfb41and ref.cast (ref ht) at 0xfb42.

rossberg commented 1 year ago

I agree that we should fill up gaps before going to two bytes, but currently we are far away from hitting that boundary, and never may. Is there any particular advantage in compacting before that? I thought that leaving some gaps actually worked reasonably well for some of the Wasm 1.0 code space.

@askeksa-google, good point, fixed.

jakobkummerow commented 1 year ago

Is there any particular advantage to leaving gaps?

FWIW, here's the summary: This PR currently lists 27 0xfb-prefixed instructions (still missing 4 array bulk instructions), and pads them with 53 gaps. These gaps are located in the following ranges of opcodes:

6 in "struct allocators"
4 in "struct accessors"
11 in "array allocators" (4 of which might be used for the bulk instructions)
11 in "array accessors" (4 of which might be used for the bulk instructions)
5 in "i31 instructions"
6 in "extern conversions"
4 in "cast instructions", non-branching group
6 in "cast instructions", branching group (assuming 0x4f is seen as the end of the range)

One data point to illustrate the low likelihood of gap sizes being lucky guesses: Aske has previously suggested a set of 8 branching i31 instructions, which (if we end up deciding to adopt them in a future proposal) wouldn't have an obvious fit in any of these gaps.

titzer commented 1 year ago

I don't think we should leave gaps in the prefixed opcode spaces. First, as mentioned above, I think it's unlikely we'll predict the size of gaps properly, meaning there will likely always be vestigial gaps. Second, leaving gaps reduces the number of two-byte encodings being utilized; in the future, will new instructions start filling in gaps if they'd otherwise overflow into the 3 byte space? Third, engines may way to use some of the unused opcode space for internal things, like quickened versions of bytecodes, which benefits interpreters.

rossberg commented 1 year ago

Coming back to this:

We left gaps in the MVP opcodes, and that proved relatively useful, both for new types and for new control instructions.
The proposed opcodes do not just leave gaps, they also provide some additional prefixing hints. For example,
- 0xfb0X struct instructions
- 0xfb1X/0xfb2X array instructions
- 0xfb3X other reference instructions
- 0xfb4X cast instructions I occasionally found such a thing useful in the past when I had to stare at binary code.
I happily believe this approach failed for SIMD, but that's because SIMD is extremely large and irregular.
I'm not sure I see the disadvantages. Note that we still have plenty of room in the single-byte space. We can still start filling up gaps or flow over into other prefixes once the need arises.

@titzer, I didn't follow your point about custom opcodes. Isn't that orthogonal?

titzer commented 1 year ago

@titzer, I didn't follow your point about custom opcodes. Isn't that orthogonal?

It's a minor point, but it's often useful to have a few opcodes left over for the VM, in each of the opcode spaces. This is easier to do if the opcode space fills up from the lowest numbers first. It doesn't work very well to try to use an opcode that is currently a hole and later filled up.

We left gaps in the MVP opcodes, and that proved relatively useful, both for new types and for new control instructions.

The 1-byte opcode space is a much more important space, but in retrospect, I think we would have been fine to have not left holes in it.

rossberg commented 1 year ago

This is easier to do if the opcode space fills up from the lowest numbers first.

I see, but this is mostly a question of leaving sufficient space at the end, right? Which I would think we have even now.

tlively commented 1 year ago

Let's discuss this at the subgroup meeting tomorrow. If we don't come to an agreement through discussion, I propose that we take a popularity vote to settle the issue. It doesn't seem important enough to be worth spending much additional energy or time on.

titzer commented 1 year ago

I won't be able to attend tomorrow because of a conflict, but I would generally prefer that we pack the prefixed opcode spaces rather than leaving holes. Is a wider CG discussion warranted, as this could be a potentially precedent-setting decision?

askeksa-google commented 1 year ago

If the purpose is to have some nice groupings, we could still do that while leaving significantly smaller holes, e.g.:

0x00+: Struct instructions
0x10+: Array instructions (including bulk)
0x20+: Conversions (extern conversions from 0x24)
0x28+: Casts

tlively commented 1 year ago

Is a wider CG discussion warranted, as this could be a potentially precedent-setting decision?

Maybe, but it seems extremely low priority and CG time is hard to come by these days.

We didn't have a critical mass of people to resolve this at the meeting yesterday, but I'd like to take the temperature here. We are only adding 31 prefixed opcodes, so even if we leave generous holes, we will not be close to overflowing to the second LEB byte, which would require 128 opcodes. Given that, do you:

🚀 : prefer leaving holes ❤️ : prefer not leaving holes 👀 : not care at all

askeksa-google commented 1 year ago

Whatever direction this goes in, I think it will have significant aesthetic value, and probably some practical value as well, if encodings that are obvious pairs (last 8 in the currently proposed ordering in particular) are aligned to only differ in the lsb.

tlively commented 1 year ago

As discussed at the last subgroup meeting, I've pushed a commit that simply removes all of the holes in the instruction opcode space but preserves the order of instructions.

Here's a sheet to help visualize the difference: https://docs.google.com/spreadsheets/d/1eaMdpL-MwLOwWgj-mWgv_tNq5crP_nemQ-PW54ss-Q8/edit?usp=sharing

Unfortunately, when you just close the holes like that, the obvious pairs of instructions all start on odd numbers, so they differ in their low two bits rather than just their lowest bit. I added a third column to the spreadsheet skipping a single opcode to "fix" that, but I don't think it's worth it.

askeksa-google commented 1 year ago

I do think it's worth skipping an opcode to align the pairs, especially since for ref.test and ref.cast, the lsb then becomes a null flag with the same meaning as the null flags in br_on_cast[_fail].

We still stay within the first 32 opcodes.

tlively commented 1 year ago

Another way we could align the pairs would be to take the three i31 operations and move them to the end. @rossberg, would you prefer to have a single hole before the i31 operations or to move the i31 operations to the end?

rossberg commented 1 year ago

I don't care much either way, but if you change anything, please sync the interpreter as well in this PR, now that I already updated it.

askeksa-google commented 1 year ago

If we move the i31 instructions to the end, I suggest we also move the extern instructions after the casts, in order to have all "conversions" (extern and i31) grouped together.

rossberg commented 1 year ago

FWIW, if we care for bits that way, shouldn't we also make it so that any _u/_s pairs of instructions only differ by LSB?

tlively commented 1 year ago

Perhaps, although I would say that struct.get{,_u,_s} and array.get(,_u,_s} are triples, so it doesn't matter so much there, leaving only i31.get_u and i31.get_s. @askeksa-google, do you have an opinion here?

rossberg commented 1 year ago

I notice that this is already not the case for some MVP instructions, so I think we can leave it as is.

tlively commented 1 year ago

Cool, let's put this into something of a "final comment period." If you are ok with the currently proposed encoding, please give this comment a 🚀 react. Otherwise, please give it an 👀 react and comment with a specific change you would like to see. After a week or so, if we have consensus, we can go ahead and merge this.

tlively commented 1 year ago

Thanks, everyone. I'll go ahead and merge this now since there have been no objections.

WebAssembly / gc

Final opcodes #372