riscv / sail-riscv

Sail RISC-V model
https://lists.riscv.org/g/tech-golden-model
Other
461 stars 167 forks source link

Vector CSRs #233

Open charmitro opened 1 year ago

charmitro commented 1 year ago

Based on the Vector Specification, vset{i}vl{i} should update the values of the Vector CSR(vl, vma, vta, vsew ,vlmul). But this doesn't seem to be the case when emulating with SAIL.

Comparing csrr rd, csr instruction outputs on SAIL & QEMU.

SAIL

In the code below, next to each instruction I have commented what the output of each instruction was,

csrr a0, vstart     ; 0x0000000000000000
csrr a0, vxsat      ; 0x0000000000000000
csrr a0, vxrm       ; 0x0000000000000000
csrr a0, vcsr       ; 0x0000000000000000
csrr a0, vl     ; 0x0000000000000010
csrr a0, vtype      ; 0x0000000000000000
csrr a0, vlenb      ; 0x0000000000000000

vsetivli x0, 4, e32, m1, ta, ma     

csrr a0, vstart     ; 0x0000000000000000
csrr a0, vxsat      ; 0x0000000000000000
csrr a0, vxrm       ; 0x0000000000000000
csrr a0, vcsr       ; 0x0000000000000000
csrr a0, vl     ; 0x0000000000000001
csrr a0, vtype      ; 0x00000000000000D0
csrr a0, vlenb      ; 0x0000000000000000

We can see that the CSRs are unchanged(except the case of vl & vtype).

QEMU

Comments contain the same information as the SAIL code snippet above, but with what GDB showed as results.

csrr a0, vstart     ; 0x0 0
csrr a0, vxsat      ; 0x0 0
csrr a0, vxrm       ; 0x0 0
csrr a0, vcsr       ; 0x0 0
csrr a0, vl     ; 0x0 0
csrr a0, vtype      ; 0x0 0
csrr a0, vlenb      ; 0x10 16

vsetivli x0, 4, e32, m1, ta, ma     

csrr a0, vstart     ; 0x0 0
csrr a0, vxsat      ; 0x0 0
csrr a0, vxrm       ; 0x0 0
csrr a0, vcsr       ; 0x0 0
csrr a0, vl     ; 0x4 4
csrr a0, vtype      ; 0xd0 208
csrr a0, vlenb      ; 0x10 16

We can notice that QEMU emulation altered the CSRs values as instructed via the vsetivli instruction, but in the case of SAIL emulation, most CSRs stay the same.

This was observed in the latest revision of the vector-dev branch.

cmuellner commented 1 year ago

CC @jrtc27 @billmcspadden-riscv @XinlaiWan @martinberger @allenjbaum

cmuellner commented 1 year ago

Note that this blocks the development of Zvk* support, where we need to iterate over element groups.

Related to that, the following function is most likely wrong as well:

/* num_elem means max(VLMAX,VLEN/SEW)) according to Section 5.4 of RVV spec */
val get_num_elem : (int, int) -> nat effect {escape, rreg}
function get_num_elem(LMUL_pow, SEW) = {
  let VLEN = int_power(2, get_vlen_pow());
  let LMUL_pow_reg = if LMUL_pow < 0 then 0 else LMUL_pow;
  /* Ignore lmul < 1 so that the entire vreg is read, allowing all masking to
   * be handled in init_masked_result */
  let num_elem = int_power(2, LMUL_pow_reg) * VLEN / SEW;
  assert(num_elem > 0);
  num_elem
}

The fact, that read-only CSRs (vl, vtype, and vlenb) can be written to at any time is probably also a bad idea. This is probably done to initialize VLEN. But if that's the case then there needs to be a lock-down mechanism, that ensures correct behavior when writing read-only CSRs.

XinlaiWan commented 1 year ago

Note that this blocks the development of Zvk* support, where we need to iterate over element groups.

Related to that, the following function is most likely wrong as well:

/* num_elem means max(VLMAX,VLEN/SEW)) according to Section 5.4 of RVV spec */
val get_num_elem : (int, int) -> nat effect {escape, rreg}
function get_num_elem(LMUL_pow, SEW) = {
  let VLEN = int_power(2, get_vlen_pow());
  let LMUL_pow_reg = if LMUL_pow < 0 then 0 else LMUL_pow;
  /* Ignore lmul < 1 so that the entire vreg is read, allowing all masking to
   * be handled in init_masked_result */
  let num_elem = int_power(2, LMUL_pow_reg) * VLEN / SEW;
  assert(num_elem > 0);
  num_elem
}
  • VLEN is the length of the vector register
  • vl is the vector length (up to this index, the vector register contents are active)
  • get_vlen_pow() converts the global vlen to a power-of-2 (0->5, 1->6, ...)
  • vlen (defined in model/riscv_vlen.sail) is not get set anywhere in the code...
  • As shown in the original post, LMUL is also not updated by the vsetivli instruction
  • So the result of this function will be always wrong
  • This function is called 63 times in the vector-dev branch
  • Most (al?) users of this function expect the number of active elements (to iterate over them)
  • The function does not honor vstart (see "prestart")

The fact, that read-only CSRs (vl, vtype, and vlenb) can be written to at any time is probably also a bad idea. This is probably done to initialize VLEN. But if that's the case then there needs to be a lock-down mechanism, that ensures correct behavior when writing read-only CSRs.

This function get_num_elem is to calculate the total number of elements that need to be updated, including prestart, active, inactive, and tail elements. And the identification of active elements uses the init_masked_result function.

The logic in our implementation is that we first use get_num_elem to determine the total number of elements to be processed. After knowing how many elements need to be read, we use read_vreg to read source and destination vector register groups as Sail vectors.

After that, the init_masked_result function will check prestart/active/inactive/tail elements and update prestart/inactive/tail elements according to the rule. The function then returns the result vector (with prestart/inactive/tail elements updated and active elements not updated) and the corresponding index information of the active elements (a vector with boolean elements). Then we can update the active elements according to the active index vector.

So the purpose of get_num_elem is not to calculate the number of active elements, and the identification of active element indexes is in another function. If there is inconvenience when using these functions, we can have some discussions and maybe add some new helper functions, but we actually have dealt with different element types in our implementation.

For the vsetivli issue, I'll check that as soon as possible.

XinlaiWan commented 1 year ago

"vlen (defined in model/riscv_vlen.sail) is not get set anywhere in the code" That's true ... In my pull requests, I mention that "ELEN and VLEN are manually specified in riscv_sys_control.sail when compiling the model". ELEN and VLEN maybe need to be specified in Sail configuration files in the future because they are determined by the user, but currently I have to manually set them every time I want to compile the model.

XinlaiWan commented 1 year ago

Based on the Vector Specification, vset{i}vl{i} should update the values of the Vector CSR(vl, vma, vta, vsew ,vlmul). But this doesn't seem to be the case when emulating with SAIL.

Comparing csrr rd, csr instruction outputs on SAIL & QEMU.

SAIL

In the code below, next to each instruction I have commented what the output of each instruction was,

csrr a0, vstart       ; 0x0000000000000000
csrr a0, vxsat        ; 0x0000000000000000
csrr a0, vxrm     ; 0x0000000000000000
csrr a0, vcsr     ; 0x0000000000000000
csrr a0, vl       ; 0x0000000000000010
csrr a0, vtype        ; 0x0000000000000000
csrr a0, vlenb        ; 0x0000000000000000

vsetivli x0, 4, e32, m1, ta, ma       

csrr a0, vstart       ; 0x0000000000000000
csrr a0, vxsat        ; 0x0000000000000000
csrr a0, vxrm     ; 0x0000000000000000
csrr a0, vcsr     ; 0x0000000000000000
csrr a0, vl       ; 0x0000000000000001
csrr a0, vtype        ; 0x00000000000000D0
csrr a0, vlenb        ; 0x0000000000000000

We can see that the CSRs are unchanged(except the case of vl & vtype).

QEMU

Comments contain the same information as the SAIL code snippet above, but with what GDB showed as results.

csrr a0, vstart       ; 0x0 0
csrr a0, vxsat        ; 0x0 0
csrr a0, vxrm     ; 0x0 0
csrr a0, vcsr     ; 0x0 0
csrr a0, vl       ; 0x0 0
csrr a0, vtype        ; 0x0 0
csrr a0, vlenb        ; 0x10 16

vsetivli x0, 4, e32, m1, ta, ma       

csrr a0, vstart       ; 0x0 0
csrr a0, vxsat        ; 0x0 0
csrr a0, vxrm     ; 0x0 0
csrr a0, vcsr     ; 0x0 0
csrr a0, vl       ; 0x4 4
csrr a0, vtype        ; 0xd0 208
csrr a0, vlenb        ; 0x10 16

We can notice that QEMU emulation altered the CSRs values as instructed via the vsetivli instruction, but in the case of SAIL emulation, most CSRs stay the same.

This was observed in the latest revision of the vector-dev branch.

In the comparison of results, firstly we can see the value of vtype is correct which means that vma, vta, vsew,vlmul can be correctly set by the vsetivli instruction. And the difference is the vl value and vlenb value. I think this issue may come from the VLEN setting.

Currently VLEN has to be manually specified in init_sys() function of riscv_sys_control.sail because of the issue I mentioned in the above conversation. If we don't manually specify that, I guess the default value is maybe 32 which is the smallest possible VLEN. In this case, if VLEN=32, the result vl value will be 1 which is shown in the Sail log. Please manually set VLEN to 128 (vlen = 0b0010) in init_sys() function of riscv_sys_control.sail and I guess vl will be the right value 4.

For vlenb, I did not implement its initialization logic because it's set as VLEN/8, and VLEN has no initialization logic now. So the log shows that it's always 0. vlenb is not used in the instruction execution of this implementation, so it does not affect the correctness of the model. But I'll add that logic if it's needed.

XinlaiWan commented 1 year ago

So I think it's finally a configuration issue. When user-specified configuration is supported, I'll add the logic of reading user-specified VLEN into the model and setting vlenb as VLEN/8. But now we don't have this logic, and VLEN (and the value of vlenb) need to be manually specified.

charmitro commented 1 year ago

So I think it's finally a configuration issue. When user-specified configuration is supported, I'll add the logic of reading user-specified VLEN into the model and setting vlenb as VLEN/8. But now we don't have this logic, and VLEN (and the value of vlenb) need to be manually specified.

Noted, I suggest that until then, a comment in the riscv_vlen.sail file should exist, specifying that it should be manually changed for now.

charmitro commented 1 year ago

Note that this blocks the development of Zvk* support, where we need to iterate over element groups. Related to that, the following function is most likely wrong as well:

/* num_elem means max(VLMAX,VLEN/SEW)) according to Section 5.4 of RVV spec */
val get_num_elem : (int, int) -> nat effect {escape, rreg}
function get_num_elem(LMUL_pow, SEW) = {
  let VLEN = int_power(2, get_vlen_pow());
  let LMUL_pow_reg = if LMUL_pow < 0 then 0 else LMUL_pow;
  /* Ignore lmul < 1 so that the entire vreg is read, allowing all masking to
   * be handled in init_masked_result */
  let num_elem = int_power(2, LMUL_pow_reg) * VLEN / SEW;
  assert(num_elem > 0);
  num_elem
}
  • VLEN is the length of the vector register
  • vl is the vector length (up to this index, the vector register contents are active)
  • get_vlen_pow() converts the global vlen to a power-of-2 (0->5, 1->6, ...)
  • vlen (defined in model/riscv_vlen.sail) is not get set anywhere in the code...
  • As shown in the original post, LMUL is also not updated by the vsetivli instruction
  • So the result of this function will be always wrong
  • This function is called 63 times in the vector-dev branch
  • Most (al?) users of this function expect the number of active elements (to iterate over them)
  • The function does not honor vstart (see "prestart")

The fact, that read-only CSRs (vl, vtype, and vlenb) can be written to at any time is probably also a bad idea. This is probably done to initialize VLEN. But if that's the case then there needs to be a lock-down mechanism, that ensures correct behavior when writing read-only CSRs.

This function get_num_elem is to calculate the total number of elements that need to be updated, including prestart, active, inactive, and tail elements. And the identification of active elements uses the init_masked_result function.

The logic in our implementation is that we first use get_num_elem to determine the total number of elements to be processed. After knowing how many elements need to be read, we use read_vreg to read source and destination vector register groups as Sail vectors.

After that, the init_masked_result function will check prestart/active/inactive/tail elements and update prestart/inactive/tail elements according to the rule. The function then returns the result vector (with prestart/inactive/tail elements updated and active elements not updated) and the corresponding index information of the active elements (a vector with boolean elements). Then we can update the active elements according to the active index vector.

So the purpose of get_num_elem is not to calculate the number of active elements, and the identification of active element indexes is in another function. If there is inconvenience when using these functions, we can have some discussions and maybe add some new helper functions, but we actually have dealt with different element types in our implementation.

For the vsetivli issue, I'll check that as soon as possible.

Since that is the case with init_masked_result, is it wise if someone directly writes to a vector register through write_single_element function?

XinlaiWan commented 1 year ago

Noted, I suggest that until then, a comment in the riscv_vlen.sail file should exist, specifying that it should be manually changed in for now.

Thanks! I'll add a comment in the next update to make it clearer.

XinlaiWan commented 1 year ago

Since that is the case with init_masked_result, is it wise if someone directly writes to a vector register through write_single_element function?

In the current implementation, write_single_element assumes that the index passed to it belongs to an active element allowed to be written. It's used in vector load instructions and the status of the destination element need to be checked before calling it.