WebAssembly / simd

Branch of the spec repo scoped to discussion of SIMD in WebAssembly
Other
530 stars 43 forks source link

Use specific `uN` types for `ImmLaneIdxM` representations #256

Open RReverser opened 4 years ago

RReverser commented 4 years ago

This is a follow-up to https://github.com/WebAssembly/simd/issues/242 and an extraction from the discussion in https://github.com/WebAssembly/threads/issues/162#issuecomment-640792474.

Short version: could we define binary representations of ImmLaneIdx* via custom-length u* types? The WebAssembly grammar already supports arbitrary-sized integers, and normally post-validation step is used for things we'd like to extend or change in the future.

I'd argue that all usages of ImmLaneIdx* do not fall under this category, as they are in instructions that make sense only with index in the strict given range. So an invalid index should be considered malformed and caught at the parsing stage rather than via post-validation.

More specifically, I propose the following representations in BinarySIMD.md:

Production Type
ImmLaneIdx2 u1
ImmLaneIdx4 u2
ImmLaneIdx8 u3
ImmLaneIdx16 u4

Thoughts?

tlively commented 4 years ago

To be clear, you're just proposing a change to the boundary between an invalid and a malformed module, right? Not a change to the binary encoding?

Seems ok to me, but we should consider @binji's comment from the other thread:

I think you could handle it in the binary grammar, since you always know the lane type from the instruction opcode. But if we wanted to handle the syntax as we do with other instructions (grouping them by the kind of instruction they are, e.g. extract_lane, replace_lane) then we'd need to have a union at that point.

I would be in favor of leaving this to whomever (@dtig?) is writing and reviewing the formal spec to give them the flexibility to simplify things as they see fit.

RReverser commented 4 years ago

To be clear, you're just proposing a change to the boundary between an invalid and a malformed module, right?

Yep.

dtig commented 4 years ago

I don't see a problem with this, also I think we already catch this as a validation failure in the V8 implementation even though it's not clearly specified here, as intuitively it seemed like the this should be a validation failure - though I can see how as it's not explicit, this can vary based on implementation. Regarding the snippet of the comment, we already need a union of these operations anyway because they don't cleanly fit into the semantics of other operations that are currently in the spec.

Though for the purposes of the overview readability I'd prefer that we retain the ImmLaneIdx* representation, and add more details about invalid indices, or maybe that's what @RReverser is proposing?

binji commented 4 years ago

@dtig right, the change here would make the following a malformed module instead of an invalid module: (i32x4.extract_lane 5) or the equivalent binary \xfd\x1b\x05. AFAIK, VMs don't often make a distinction, but the spec does. Currently the binary format uses u8, meaning that any value in the range [0,255] is well-formed, but potentially invalid. This suggestion intends to make the well-formed ranges smaller, to better match the requirements of the instruction.

ngzhian commented 4 years ago

The current spec text [0] uses a single definition of "lane index" for simplicity, and it is defined as a u8. Then later we do the actual bounds check in the validation [1]. This is simple because the definition of all the extract_lane and replace_lane can be grouped.

If we had separate laneidx2, laneidx4 etc, we first need to define each of these valid lane indices, then we will need to spell out individual extract_lane and replace_lane instructions in the syntax [0]. But then we won't need to mention the bounds check in validation at all.

I don't think either is more simple, just different presentation.

[0] https://www.ngzhian.com/simd/core/syntax/instructions.html#simd-instructions [1] https://www.ngzhian.com/simd/core/valid/instructions.html#xref-syntax-instructions-syntax-shape-mathit-shape-mathsf-xref-syntax-instructions-syntax-instr-simd-mathsf-extract-lane-mathsf-xref-syntax-instructions-syntax-sx-mathit-sx-xref-syntax-instructions-syntax-laneidx-mathit-laneidx

RReverser commented 4 years ago

I don't think either is more simple, just different presentation.

Right, it's just about moving checks earlier so that such mismatch could be reported by parsers rather than validators.

ngzhian commented 4 years ago

Yea agree, the point of my post was to highlight what spec changes will need to be made if we were to make this change - which isn't a lot. We can probably discuss this at the next sync #338

abrown commented 4 years ago

@alexcrichton, @yurydelendik: I haven't read this too closely but from the subgroup meeting I think this is forward and may affect https://github.com/bytecodealliance/wasm-tools.

ngzhian commented 4 years ago

From today's sync meeting, it was agreed that we can make this change:

If anyone foresees issues with this change, please raise it up (thanks Andrew for cc-ing).

RReverser commented 4 years ago

Thanks!