UoB-HPC / SimEng

The University of Bristol HPC Simulation Engine
https://uob-hpc.github.io/SimEng
Apache License 2.0

ACFL23 Instruction Support #425

Open JosephMoore25 opened 2 months ago

JosephMoore25 commented 2 months ago

This PR merges work done towards enabling support for a few codes for ACFL 23, namely STREAM, miniBUDE, CloverLeaf, TeaLeaf, and Minisweep.

This PR is mostly made up of added instruction support. 58 instructions have been added: 24 unique instructions, with the remainder being variants. Most instructions are SVE, with some NEON added.

An additional "infinite loop checking" feature has been added. It adds a counter in the ROB which throws an error if the same address has been at the head of the ROB for a very long time. This catches a few previously found errors where an erroneous config or broken logic could cause SimEng to get caught in a loop and, in some cases, eventually run out of memory (OOM).
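As a rough illustration of the idea, here is a minimal sketch of a counter that trips when the address at the ROB head stops changing. It is not the code in this PR: the class name `HeadStallDetector`, its `observe` method, and the `threshold` parameter are all hypothetical, and the real check may differ in how and when it counts.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper, not SimEng's actual ROB code.
// Counts how many consecutive observations saw the same head address.
class HeadStallDetector {
 public:
  // `threshold` is an assumed tuning knob: the number of consecutive
  // repeats of one head address that is treated as a stuck simulation.
  explicit HeadStallDetector(uint64_t threshold) : threshold_(threshold) {
    assert(threshold > 0);
  }

  // Intended to be called once per cycle with the instruction address
  // currently at the head of the ROB. Throws once the same address has
  // repeated `threshold` times in a row.
  void observe(uint64_t headAddress) {
    if (seen_ && headAddress == lastAddress_) {
      ++repeatCount_;
      if (repeatCount_ >= threshold_) {
        throw std::runtime_error(
            "Same address stuck at ROB head; possible infinite loop");
      }
    } else {
      // A new address reached the head, so forward progress was made.
      lastAddress_ = headAddress;
      repeatCount_ = 0;
      seen_ = true;
    }
  }

  uint64_t repeatCount() const { return repeatCount_; }

 private:
  uint64_t threshold_;
  uint64_t lastAddress_ = 0;
  uint64_t repeatCount_ = 0;
  bool seen_ = false;
};
```

With a threshold of 3 in this sketch, the fourth consecutive call to `observe` with the same address throws, while any change of address resets the count. A real threshold would need to be large enough that long-latency instructions do not trigger false positives.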

This PR also fixes an OpenMP bug that previously surfaced with ACFL 23 support; Jack had done this work in a separate branch.

Tests are still being added, and the new group tests need to be added for all instructions. The PR will leave the draft stage once all tests have been added.

Here is a list of the instructions added:

| Opcode | Inst Format | General Test added? | Group Test added? |
| --- | --- | --- | --- |
| Opcode::AArch64_UADDLVv8i8v | uaddlv hd, vn.8b | Yes | |
| Opcode::AArch64_FTSMUL_ZZZ_S | ftsmul zd.s, zn.s, zm.s | Yes | |
| Opcode::AArch64_FTSMUL_ZZZ_D | ftsmul zd.d, zn.d, zm.d | Yes | |
| Opcode::AArch64_FTSSEL_ZZZ_S | ftssel zd.s, zn.s, zm.s | Yes | |
| Opcode::AArch64_FTSSEL_ZZZ_D | ftssel zd.d, zn.d, zm.d | Yes | |
| Opcode::AArch64_FTMAD_ZZI_S | ftmad zd.s, zn.s, zm.s, #imm | Yes | |
| Opcode::AArch64_FTMAD_ZZI_D | ftmad zd.d, zn.d, zm.d, #imm | Yes | |
| Opcode::AArch64_CMEQv2i32rz | cmeq vd.2s, vn.2s, #0 | Yes | |
| Opcode::AArch64_CMHIv2i32 | cmhi vd.2s, vn.2s, vm.2s | Yes | |
| Opcode::AArch64_CMPHS_PPzZZ_B | cmphs pd.b, pg/z, zn.b, zm.b | Yes | |
| Opcode::AArch64_CMPHS_PPzZZ_D | cmphs pd.d, pg/z, zn.d, zm.d | Yes | |
| Opcode::AArch64_CMPHS_PPzZZ_H | cmphs pd.h, pg/z, zn.h, zm.h | Yes | |
| Opcode::AArch64_CMPHS_PPzZZ_S | cmphs pd.s, pg/z, zn.s, zm.s | Yes | |
| Opcode::AArch64_CPY_ZPmV_B | cpy zd.b, pg/m, vn.b | Yes | |
| Opcode::AArch64_CPY_ZPmV_D | cpy zd.d, pg/m, vn.d | Yes | |
| Opcode::AArch64_CPY_ZPmV_H | cpy zd.h, pg/m, vn.h | Yes | |
| Opcode::AArch64_CPY_ZPmV_S | cpy zd.s, pg/m, vn.s | Yes | |
| Opcode::AArch64_FDIVv4f32 | fdiv vd.4s, vn.4s, vm.4s | Yes | |
| Opcode::AArch64_LASTB_VPZ_D | lastb dd, pg, zn.d | Yes | |
| Opcode::AArch64_LASTB_VPZ_S | lastb sd, pg, zn.s | Yes | |
| Opcode::AArch64_LASTB_VPZ_H | lastb hd, pg, zn.h | Yes | |
| Opcode::AArch64_LASTB_VPZ_B | lastb bd, pg, zn.b | Yes | |
| Opcode::AArch64_CLASTB_VPZ_D | clastb dd, pg, dn, zn.d | Yes | |
| Opcode::AArch64_CLASTB_VPZ_S | clastb sd, pg, sn, zn.s | Yes | |
| Opcode::AArch64_CLASTB_VPZ_H | clastb hd, pg, hn, zn.h | Yes | |
| Opcode::AArch64_CLASTB_VPZ_B | clastb bd, pg, bn, zn.b | Yes | |
| Opcode::AArch64_LDAXRB | ldaxrb wt, [xn] | Yes | |
| Opcode::AArch64_LDRSWroW | ldrsw xt, [xn, wm, {extend {#amount}}] | Yes | |
| Opcode::AArch64_ORNv8i8 | orn vd.8b, vn.8b, vm.8b | Yes | |
| Opcode::AArch64_PFIRST_B | pfirst pdn.b, pg, pdn.b | Yes | |
| Opcode::AArch64_PNEXT_B | pnext pdn.b, pv, pdn.b | Yes | |
| Opcode::AArch64_PNEXT_H | pnext pdn.h, pv, pdn.h | Yes | |
| Opcode::AArch64_PNEXT_S | pnext pdn.s, pv, pdn.s | Yes | |
| Opcode::AArch64_PNEXT_D | pnext pdn.d, pv, pdn.d | Yes | |
| Opcode::AArch64_SMAX_ZI_D | smax zdn.d, zdn.d, #imm | Yes | |
| Opcode::AArch64_SMAX_ZI_H | smax zdn.h, zdn.h, #imm | Yes | |
| Opcode::AArch64_SMAX_ZI_B | smax zdn.b, zdn.b, #imm | Yes | |
| Opcode::AArch64_SMAX_ZPmZ_D | smax zd.d, pg/m, zn.d, zm.d | Yes | |
| Opcode::AArch64_SMAX_ZPmZ_H | smax zd.h, pg/m, zn.h, zm.h | Yes | |
| Opcode::AArch64_SMAX_ZPmZ_B | smax zd.b, pg/m, zn.b, zm.b | Yes | |
| Opcode::AArch64_SMINV_VPZ_D | sminv dd, pg, zn.d | Yes | |
| Opcode::AArch64_SMINV_VPZ_H | sminv hd, pg, zn.h | Yes | |
| Opcode::AArch64_SMINV_VPZ_B | sminv bd, pg, zn.b | Yes | |
| Opcode::AArch64_SMIN_ZPmZ_D | smin zd.d, pg/m, zn.d, zm.d | Yes | |
| Opcode::AArch64_SMIN_ZPmZ_H | smin zd.h, pg/m, zn.h, zm.h | Yes | |
| Opcode::AArch64_SMIN_ZPmZ_B | smin zd.b, pg/m, zn.b, zm.b | Yes | |
| Opcode::AArch64_SPLICE_ZPZ_D | splice zdn.d, pv, zdn.d, zm.d | Yes | |
| Opcode::AArch64_SPLICE_ZPZ_S | splice zdn.s, pv, zdn.s, zm.s | Yes | |
| Opcode::AArch64_STLXRB | stlxrb ws, wt, [xn] | Yes | |
| Opcode::AArch64_STLXRH | stlxrh ws, wt, [xn] | Yes | |
| Opcode::AArch64_STLXR | stlxr ws, {w,x}t, [xn] | Yes | |
| Opcode::AArch64_UMAXVv16i8v | umaxv bd, vn.16b | Yes | |
| Opcode::AArch64_UMAXVv4i16v | umaxv hd, vn.4h | Yes | |
| Opcode::AArch64_UMAXVv4i32v | umaxv sd, vn.4s | Yes | |
| Opcode::AArch64_UMAXVv8i16v | umaxv hd, vn.8h | Yes | |
| Opcode::AArch64_UMAXVv8i8v | umaxv bd, vn.8b | Yes | |
| Opcode::AArch64_WHILELS_PXX_B | whilels pd.b, xn, xm | Yes | |
| Opcode::AArch64_WHILELS_PXX_D | whilels pd.d, xn, xm | Yes | |
| Opcode::AArch64_WHILELS_PXX_H | whilels pd.h, xn, xm | Yes | |
| Opcode::AArch64_WHILELS_PXX_S | whilels pd.s, xn, xm | Yes | |
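
For readers less familiar with the predicated SVE forms in the list above, the following sketch models the lane-wise behaviour of a merging signed maximum such as SMAX_ZPmZ_D: active lanes take the larger of the two source elements, and inactive lanes keep the value from the first source. It is illustrative only. The function name `smaxMerging` and the use of `std::vector` to stand in for vector and predicate registers are assumptions, not how SimEng implements the instruction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a predicated, merging signed max over 64-bit lanes.
// pg  : one flag per lane (true = active lane)
// zdn : first source, which also supplies the values for inactive lanes
// zm  : second source
std::vector<int64_t> smaxMerging(const std::vector<bool>& pg,
                                 const std::vector<int64_t>& zdn,
                                 const std::vector<int64_t>& zm) {
  assert(pg.size() == zdn.size() && zdn.size() == zm.size());
  std::vector<int64_t> out(zdn);  // inactive lanes keep the first source
  for (std::size_t i = 0; i < out.size(); ++i) {
    if (pg[i]) {
      out[i] = std::max(zdn[i], zm[i]);  // signed comparison per lane
    }
  }
  return out;
}
```

For example, with predicate `{true, false, true}`, first source `{1, 5, -3}`, and second source `{4, 9, -7}`, the sketch returns `{4, 5, -3}`: the middle lane is inactive, so it is left unchanged.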