riscv / sail-riscv

Sail RISC-V model
https://lists.riscv.org/g/tech-golden-model

`vsetvli` vector tail agnostic and vector mask agnostic operands are mandatory #489

Open ThinkOpenly opened 1 month ago

ThinkOpenly commented 1 month ago

In the RVV 1.0 spec, there is this text (in 3.4.3):

The assembly syntax adds two mandatory flags to the vsetvli instruction [...]

(I'm not sure why "adds" was used there.) The above text is complemented by a note:

Prior to v0.9, when these flags were not specified on a vsetvli, they defaulted to mask-undisturbed/tail-undisturbed. The use of vsetvli without these flags is deprecated, however, and specifying a flag setting is now mandatory. The default should perhaps be tail-agnostic/mask-agnostic, so software has to specify when it cares about the non-participating elements, but given the historical meaning of the instruction prior to introduction of these flags, it was decided to always require them in future assembly code.

Note that the vector tail agnostic and vector mask agnostic operands are mandatory.

I cannot find an analogous statement about the lmul operand, but given its position in the middle of the list of operands, it would seem reasonable to infer that it is also mandatory.

Unfortunately, there are examples of vsetvli without these "mandatory" operands. In (6):

vsetvli rd, rs1, vtypei # rd = new vl, rs1 = AVL, vtypei = new vtype setting

in (6.1):

vsetvli t0, a0, e8          # SEW= 8, LMUL=1
vsetvli t0, a0, e8, m2      # SEW= 8, LMUL=2
vsetvli t0, a0, e32, mf2    # SEW=32, LMUL=1/2

in (7.9):

vsetvli t1, x0, e8, m8     # Maximum VLMAX
vlm.v v0, (a0)             # Load mask register
vsetvli x0, t0, <new type> # Restore vl (potentially already present)

In model/riscv_insts_vext_vset.sail, these operands are supported as optional:

mapping maybe_lmul_flag : string <-> bits(3) = {
  ""              <-> 0b000, /* m1 by default */
  sep() ^ "mf8"   <-> 0b101,
  sep() ^ "mf4"   <-> 0b110,
  sep() ^ "mf2"   <-> 0b111,
  sep() ^ "m1"    <-> 0b000,
  sep() ^ "m2"    <-> 0b001,
  sep() ^ "m4"    <-> 0b010,
  sep() ^ "m8"    <-> 0b011
}

mapping maybe_ta_flag : string <-> bits(1) = {
  ""           <-> 0b0, /* tu by default */
  sep() ^ "ta" <-> 0b1,
  sep() ^ "tu" <-> 0b0
}

mapping maybe_ma_flag : string <-> bits(1) = {
  ""           <-> 0b0, /* mu by default */
  sep() ^ "ma" <-> 0b1,
  sep() ^ "mu" <-> 0b0
}
[...]
mapping clause assembly = VSETVLI(ma, ta, sew, lmul, rs1, rd)
  <-> "vsetvli" ^ spc() ^ reg_name(rd) ^ sep() ^ reg_name(rs1) ^ sep() ^ sew_flag(sew) ^ maybe_lmul_flag(lmul) ^ maybe_ta_flag(ta) ^ maybe_ma_flag(ma)

I believe a spec-conforming Sail implementation should not have the "by default" cases in the "maybe" functions (and the "maybe" functions should be renamed to remove the "maybe").
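To make concrete what the "by default" cases encode, here is a minimal Python sketch of how the four settings pack into the low 8 bits of vtypei, using the RVV 1.0 field layout (vma in bit 7, vta in bit 6, vsew in bits 5:3, vlmul in bits 2:0). The function name `encode_vtypei` and the dictionaries are hypothetical, not part of the Sail model.

```python
# Hypothetical helper, not from the Sail model: packs vsetvli settings
# into the low 8 bits of vtypei (layout: vma[7] vta[6] vsew[5:3] vlmul[2:0]).
VSEW = {"e8": 0b000, "e16": 0b001, "e32": 0b010, "e64": 0b011}
VLMUL = {"mf8": 0b101, "mf4": 0b110, "mf2": 0b111,
         "m1": 0b000, "m2": 0b001, "m4": 0b010, "m8": 0b011}
VTA = {"tu": 0, "ta": 1}
VMA = {"mu": 0, "ma": 1}


def encode_vtypei(sew, lmul, ta, ma):
    """Return the 8-bit vtypei value for the given assembly flags."""
    return (VMA[ma] << 7) | (VTA[ta] << 6) | (VSEW[sew] << 3) | VLMUL[lmul]


# With the model's defaults, a bare "e8" is treated like "e8, m1, tu, mu".
print(hex(encode_vtypei("e8", "m1", "tu", "mu")))
print(hex(encode_vtypei("e32", "m8", "ta", "ma")))
```

Under the current defaults, a bare `vsetvli t0, a0, e8` therefore assembles to the same bits as `e8, m1, tu, mu`, which is the pre-v0.9 behaviour that the spec note describes as deprecated.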

(@XinlaiWan?)

ThinkOpenly commented 1 month ago

I cannot find an analogous statement about the lmul operand, but given its position in the middle of the list of operands, it would seem reasonable to infer that it is also mandatory.

Section 6.1 has text and an example treating lmul as an optional operand:

m1 # LMUL=1, assumed if m setting absent
[...]
Examples:
    vsetvli t0, a0, e8          # SEW= 8, LMUL=1
    vsetvli t0, a0, e8, m2      # SEW= 8, LMUL=2
    vsetvli t0, a0, e32, mf2    # SEW=32, LMUL=1/2

I presume, then, that lmul is indeed optional, whereas vta and vma are not.

latifbhatti commented 1 month ago

When lmul is absent, as in the instruction vsetvli t0, a0, e8, the vector register group multiplier (LMUL) defaults to 1, so the result is the same as writing vsetvli t0, a0, e8, m1 explicitly. Applications that need a different register grouping select another lmul value.

Example (usage of ta and tu): with lmul = 8, sew (selected element width) = 32, and VLEN (vector register length in bits) = 256, VLMAX is 8 × 256 / 32 = 64 elements. If we only want to work with 50 elements (vl = 50), the remaining 14 are tail elements, and their behavior is determined by the tail policy. With ta (tail agnostic, vta = 1), each tail element may either keep its previous value or be overwritten with all 1s. With tu (tail undisturbed, vta = 0), the last 14 elements keep the values previously stored in the destination vector register.
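The arithmetic in this example can be checked with a small Python sketch. The function names are hypothetical, and two simplifications are assumed: vl is taken as min(AVL, VLMAX), which is one of the choices the spec permits, and the tail-agnostic case is modelled as writing all 1s, although an implementation may equally leave the old values in place.

```python
from fractions import Fraction

ALL_ONES_32 = 0xFFFFFFFF  # all-1s pattern for SEW=32


def vlmax(vlen_bits, sew_bits, lmul):
    """VLMAX = LMUL * VLEN / SEW (lmul may be an int or a Fraction)."""
    return int(Fraction(lmul) * vlen_bits // sew_bits)


def apply_tail_policy(old_dest, results, vl, tail_agnostic):
    """Model the destination register group after an unmasked operation.

    Elements below vl take the new result. Tail elements keep the old
    value (tu) or, in this sketch only, become all 1s (ta).
    """
    out = []
    for i, old in enumerate(old_dest):
        if i < vl:
            out.append(results[i])
        elif tail_agnostic:
            out.append(ALL_ONES_32)  # assumption: one permitted outcome
        else:
            out.append(old)
    return out


max_elems = vlmax(256, 32, 8)   # 64
vl = min(50, max_elems)         # 50
print(max_elems, vl, max_elems - vl)
```

Software that sets ta must not rely on either outcome for the tail elements; the sketch picks all 1s only to make the difference from tu visible.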

Masking explanation: masking allows selective processing of elements. For instance, to operate only on specific elements, such as (2,4,5,6,7,8,12,22,34), we apply a mask. In an arithmetic instruction, the vm field (bit 25 of the encoding) selects masking: vm = 0 means the instruction is masked (written with v0.t in assembly). The mask itself is held in vector register v0. With ma (mask agnostic, vma = 1), inactive elements may either keep their previous values or be overwritten with all 1s. With mu (mask undisturbed, vma = 0), inactive elements keep their previous values.
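A similar Python sketch illustrates the mask policy for a masked add-immediate at SEW=32. The function `masked_add_imm` is hypothetical, and mask-agnostic is again modelled as writing all 1s, which is only one of the outcomes an implementation may produce.

```python
ALL_ONES_32 = 0xFFFFFFFF  # all-1s pattern for SEW=32


def masked_add_imm(old_dest, src, imm, mask, mask_agnostic):
    """Model `vadd.vi vd, vs2, imm, v0.t` on the body elements.

    mask[i] is the v0 mask bit for element i. Active elements receive
    src[i] + imm. Inactive elements keep the old destination value (mu)
    or, in this sketch only, become all 1s (ma).
    """
    out = []
    for old, s, active in zip(old_dest, src, mask):
        if active:
            out.append((s + imm) & ALL_ONES_32)
        elif mask_agnostic:
            out.append(ALL_ONES_32)  # assumption: one permitted outcome
        else:
            out.append(old)
    return out


old = [100, 101, 102, 103]
src = [1, 2, 3, 4]
mask = [1, 0, 1, 0]
print(masked_add_imm(old, src, 11, mask, mask_agnostic=False))
print(masked_add_imm(old, src, 11, mask, mask_agnostic=True))
```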

ThinkOpenly commented 1 month ago

When lmul is absent, as in the instruction vsetvli t0, a0, e8, it implies that the vector register group multiplier (lmul) is set to 1.

I agree with the lmul assertion.

I contend, on the other hand, that vsetvli t0, a0, e8 is not valid syntax because vta and vma operands are mandatory. Correct syntax would include both mandatory operands as in vsetvli t0, a0, e8, ta, ma.
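The syntax rule argued for here (sew required, lmul optional with m1 as the default, ta/tu and ma/mu both required) can be sketched as a small Python checker. This is an illustration of the rule only; `parse_vsetvli` is a hypothetical name and is not how the Sail model or any assembler implements it.

```python
SEW_FLAGS = {"e8", "e16", "e32", "e64"}
LMUL_FLAGS = {"mf8", "mf4", "mf2", "m1", "m2", "m4", "m8"}


def parse_vsetvli(line):
    """Parse one vsetvli line; return (rd, rs1, sew, lmul, ta, ma).

    Raises ValueError when a mandatory operand is missing or invalid.
    """
    text = line.split("#", 1)[0].strip()      # drop trailing comment
    mnemonic, rest = text.split(None, 1)
    if mnemonic != "vsetvli":
        raise ValueError("not a vsetvli instruction")
    fields = [f.strip() for f in rest.split(",")]
    if len(fields) < 3:
        raise ValueError("expected rd, rs1 and a vtype setting")
    rd, rs1, ops = fields[0], fields[1], fields[2:]
    if ops[0] not in SEW_FLAGS:
        raise ValueError("missing or invalid SEW flag")
    sew = ops.pop(0)
    # lmul is optional; m1 is assumed when it is absent.
    lmul = ops.pop(0) if ops and ops[0] in LMUL_FLAGS else "m1"
    if len(ops) != 2 or ops[0] not in {"ta", "tu"} or ops[1] not in {"ma", "mu"}:
        raise ValueError("ta/tu and ma/mu flags are mandatory")
    return rd, rs1, sew, lmul, ops[0], ops[1]


print(parse_vsetvli("vsetvli t0, a0, e8, ta, ma"))
```

With this checker, `vsetvli t0, a0, e8` raises `ValueError`, while `vsetvli t0, a0, e8, ta, ma` is accepted with lmul filled in as m1.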

latifbhatti commented 1 month ago

I forgot to mention the mandatory ‘ta’ and ‘ma’ flags in the vsetvli setting. These flags are mandatory.


# Configure LMUL=8, SEW=32, ta, ma (assuming VLEN=256); AVL=50
addi a0, zero, 50
vsetvli t0, a0, e32, m8, ta, ma

vadd.vi v0, v0, 10
vadd.vi v16, v8, 11, v0.t
vadd.vv v8, v16, v0
vadd.vi v16, v8, 7, v0.t
vadd.vx v8, v16, x1

KotorinMinami commented 1 month ago

I think this makes a lot of sense. Additionally, @Alasdair, we know that the assembly direction (string -> AST) of the mapping has no semantic effect on the model, and the only place it would really be visible is in the JSON output for documentation. However, what about the disassembly direction? Given that some binary values have two corresponding string mappings, which one would the Sail simulator print when disassembling? If it prints the default option, I think this issue falls into the same category as the earlier issue #21. What do you think?

jrtc27 commented 1 month ago

Like any other kind of match it'll pick the first option that matches.
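A rough Python model of that first-match behaviour, applied to the tables quoted in the issue, shows what the disassembly direction would print. The tables mirror the Sail mappings in order; treating `sep()` as printing ", " is an assumption, and `first_match` and `disassemble_flags` are hypothetical names.

```python
# Ordered (string, bits) pairs mirroring the Sail mappings in the issue.
# Assumption: sep() prints as ", " in the bits -> string direction.
MAYBE_LMUL_FLAG = [("", 0b000), (", mf8", 0b101), (", mf4", 0b110),
                   (", mf2", 0b111), (", m1", 0b000), (", m2", 0b001),
                   (", m4", 0b010), (", m8", 0b011)]
MAYBE_TA_FLAG = [("", 0b0), (", ta", 0b1), (", tu", 0b0)]
MAYBE_MA_FLAG = [("", 0b0), (", ma", 0b1), (", mu", 0b0)]


def first_match(table, bits):
    """Return the string of the first clause whose bit pattern matches."""
    for text, value in table:
        if value == bits:
            return text
    raise ValueError("no clause matches")


def disassemble_flags(lmul, ta, ma):
    """Concatenate the flag strings in the order used by the assembly clause."""
    return (first_match(MAYBE_LMUL_FLAG, lmul)
            + first_match(MAYBE_TA_FLAG, ta)
            + first_match(MAYBE_MA_FLAG, ma))


print(repr(disassemble_flags(0b000, 0, 0)))
print(repr(disassemble_flags(0b001, 1, 1)))
```

Because the empty-string clause comes first for 0b000 and 0b0, the ", m1", ", tu" and ", mu" clauses are never reached in this direction. Removing the empty-string clauses from the ta and ma tables would make "tu" and "mu" print explicitly.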

KotorinMinami commented 1 month ago

In this case, this means the disassembly will never output 'mu' or 'tu', right? If so, I think this could indeed cause some confusion when using Sail, since Sail's output would differ from what the RISC-V specification states. Therefore, I believe the default cases could be removed, and deleting them shouldn't have any significant impact on the model.