OpenXiangShan / XiangShan

Open-source high-performance RISC-V processor

VPU: new vcompress to fit v0&vl split; fix vfredsum/min/max #3053

Closed lewislzh closed 2 weeks ago

lewislzh commented 3 weeks ago

fix vfredusum/max/min:
- When every element of a vector vfredusum/max/min is inactive and vs1[0] is NaN, the result should be vs1[0].
- When both elements of vfredusum are inactive, the temporary result is changed from positive zero to negative zero.

new vcompress to fit v0/vl split:
- The vcompress calculation combines the ones_sum result with vs1 in a temporary register, which saves one read operation.
- The other uops (all except ones_sum) drop the basemask calculation and the basemask right-shift operation.

fix vpermtest to fit new vcompress
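The reduction fixes above can be illustrated with a small software model. This is only a sketch of the expected behavior, not the XiangShan RTL: the function names `vfredusum_model`, `fill_inactive`, and `tree_sum` are hypothetical, the pairwise tree is an assumed reduction order, and Python floats use round-to-nearest, so other rounding modes are not covered. It shows that an all-inactive reduction passes vs1[0] through unchanged (NaN included), and why filling inactive lanes with negative zero rather than positive zero keeps the sign of an active -0.0.

```python
import math

def vfredusum_model(vs1_0, vs2, mask, vl):
    """Hypothetical model of an unordered FP sum reduction."""
    active = [vs2[i] for i in range(vl) if mask[i]]
    if not active:
        # No active elements: vs1[0] is passed through unchanged,
        # even when it is a NaN.
        return vs1_0
    acc = vs1_0
    for x in active:
        acc += x
    return acc

def fill_inactive(vs2, mask, identity):
    """Replace inactive lanes with the chosen additive identity."""
    return [x if m else identity for x, m in zip(vs2, mask)]

def tree_sum(values):
    """Pairwise (tree) sum; len(values) must be a power of two."""
    values = list(values)
    while len(values) > 1:
        values = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
    return values[0]

def sign_of(x):
    """Return '-' for negative values (including -0.0), else '+'."""
    return "-" if math.copysign(1.0, x) < 0 else "+"

# Only lane 0 is active and it holds -0.0.
vs2 = [-0.0, 9.0, 9.0, 9.0]
mask = [True, False, False, False]

print(sign_of(tree_sum(fill_inactive(vs2, mask, 0.0))))   # prints "+"
print(sign_of(tree_sum(fill_inactive(vs2, mask, -0.0))))  # prints "-"
```

With +0.0 as the filler, the first addition is (-0.0) + (+0.0), which rounds to +0.0 and loses the sign of the only active element. With -0.0 as the filler, every intermediate sum stays -0.0.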

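For readers unfamiliar with vcompress, the following sketch models what the instruction is expected to produce. It is a plain software model, not the uop split described in this PR: `vcompress_model` is a hypothetical name, `ones_sum` is assumed here to mean the running count of set mask bits, and a tail-undisturbed policy is assumed for the elements that are not written.

```python
def vcompress_model(vd, vs2, vs1_mask, vl):
    """Hypothetical model of vcompress.vm vd, vs2, vs1."""
    result = list(vd)   # start from the old vd so unwritten elements are kept
    ones_sum = 0        # number of set mask bits seen so far
    for i in range(vl):
        if vs1_mask[i]:
            # The n-th selected element lands at index n of the destination.
            result[ones_sum] = vs2[i]
            ones_sum += 1
    return result

old_vd = [0] * 8
vs2 = [10, 11, 12, 13, 14, 15, 16, 17]
vs1_mask = [1, 0, 0, 1, 0, 1, 0, 0]

print(vcompress_model(old_vd, vs2, vs1_mask, 8))
# prints [10, 13, 15, 0, 0, 0, 0, 0]
```

The destination index of each selected element depends only on how many mask bits precede it, which is why a precomputed count of ones can be shared across the uops that write different parts of the destination.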
XiangShanRobot commented 2 weeks ago
[Generated by IPC robot]
commit: da57bb3fd6bd2c42dca067d3e82ff6ef0392a620

| commit | astar | copy_and_run | coremark | gcc | gromacs | lbm | linux | mcf | microbench | milc | namd | povray | wrf | xalancbmk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| da57bb3 | 1.827 | 0.448 | 2.048 | 1.190 | 2.938 | 2.508 | 2.289 | 0.932 | 1.369 | 1.410 | 3.445 | 2.672 | 2.398 | 2.932 |

master branch:

| commit | astar | copy_and_run | coremark | gcc | gromacs | lbm | linux | mcf | microbench | milc | namd | povray | wrf | xalancbmk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ef14270 | 1.827 | 0.448 | 2.048 | 1.190 | 2.938 | 2.508 | 2.289 | 0.932 | 1.369 | 1.410 | 3.445 | 2.672 | 2.398 | 2.932 |
| 5c5f442 | 1.827 | 0.448 | 2.048 | 1.190 | 2.938 | 2.508 | 2.289 | 0.932 | 1.369 | 1.410 | 3.445 | 2.672 | 2.398 | 2.932 |
| d67c873 | 1.827 | 0.448 | 2.048 | 1.190 | 2.938 | 2.508 | 2.289 | 0.932 | 1.369 | 1.410 | 3.445 | 2.672 | 2.398 | 2.932 |
| 00f9d18 | 1.809 | 0.448 | 2.060 | 1.191 | 2.938 | 2.508 | 2.290 | 0.932 | 1.419 | 1.338 | 3.431 | 2.642 | 2.398 | 2.932 |
| 3b94d5d | 1.809 | 0.448 | 2.060 | 1.191 | 2.938 | 2.508 | 2.290 | 0.932 | 1.419 | 1.338 | 3.431 | 2.660 | 2.398 | 2.932 |
| 2f6c010 | 1.820 | 0.448 | 2.060 | 1.199 | 2.938 | 2.508 | 2.290 | 0.932 | 1.419 | 1.338 | 3.427 | 2.651 | 2.398 | 2.932 |
| 0f42355 | 1.820 | 0.448 | 2.060 | 1.199 | 2.938 | 2.508 | 2.290 | 0.932 | 1.419 | 1.338 | 3.427 | 2.651 | 2.398 | 2.932 |
| 95e6033 | 1.820 | 0.448 | 2.060 | 1.199 | 2.938 | 2.508 | 2.290 | 0.932 | 1.419 | 1.338 | 3.427 | 2.651 | 2.398 | 2.932 |
| 58cb1b0 | 1.822 | 0.448 | 2.060 | 1.180 | 2.944 | 2.503 | 2.291 | 0.932 | 1.419 | 1.328 | 3.437 | 2.644 | 2.399 | 2.931 |
| 202ef6b | 1.815 | 0.448 | 2.060 | 1.182 | 2.953 | 2.504 | 2.291 | 0.930 | 1.403 | 1.319 | 3.426 | 2.660 | 2.397 | 2.940 |