Open baul-iisc opened 1 month ago
I would greatly appreciate it if someone could provide guidance on how to compile code that utilizes these newly added instructions and execute it on Spike pk to characterize the workload.
As I said here:
you can use inline assembly to compile your custom instructions into a RISC-V executable using a RISC-V toolchain that has been extended to support those instructions.
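As a rough illustration of the inline-assembly route, here is a minimal sketch of a wrapper around one of the new instructions. It assumes the assembler already accepts the `fsin.s` mnemonic; `USE_CUSTOM_TRIG` and `trig_sinf` are hypothetical names chosen for this example. The fallback branch would normally call the existing software routine (for example `sinf`); a short Taylor polynomial stands in for it here only so the sketch builds without extra libraries.

```c
/* USE_CUSTOM_TRIG is a hypothetical build flag: define it only when compiling
 * with a toolchain whose assembler already knows the fsin.s mnemonic. */
#if defined(__riscv) && defined(USE_CUSTOM_TRIG)
static inline float trig_sinf(float x)
{
    float result;
    /* The "f" constraints place both operands in floating-point registers. */
    __asm__ volatile ("fsin.s %0, %1" : "=f"(result) : "f"(x));
    return result;
}
#else
/* Stand-in for the existing software routine: a 7th-order Taylor polynomial,
 * reasonable only for small |x|. Replace it with the real library call. */
static inline float trig_sinf(float x)
{
    float x2 = x * x;
    return x * (1.0f - x2 / 6.0f * (1.0f - x2 / 20.0f * (1.0f - x2 / 42.0f)));
}
#endif
```

Building the same workload once with and once without the flag would give two executables to compare, and only the call sites that use the wrapper need to change.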
You'll probably need to explain in more detail what exactly you mean by this:
execute it on Spike pk to characterize the workload.
What sort of "characterization" do you mean?
It's also confusing that you mention Spike and Gem5 when your focus seems to be specifically on Spike (I think)?
Here are some examples of unit tests that can be run on spike: https://github.com/riscv-software-src/riscv-tests
I am aiming to characterize a set of specialized scientific computing workloads that comprise a mix of floating-point computations and numerous trigonometric operations. Currently, these trigonometric operations are implemented using a software library. I am interested in determining whether implementing these operations in hardware provides any execution time benefit. Specifically, I want to assess the percentage improvement in performance.
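The percentage improvement described above can be estimated along these lines. This is only a sketch: `amdahl_speedup` and `percent_improvement` are hypothetical helper names, and the inputs (the fraction of run time spent in trigonometric routines and the per-operation speedup) are values you would have to measure or assume. Note also that Spike is a functional simulator, so counts taken from it are instruction counts rather than execution time.

```c
/* Overall speedup predicted by Amdahl's law when a fraction of the baseline
 * run time (trig_fraction, between 0 and 1) is accelerated by trig_speedup. */
static double amdahl_speedup(double trig_fraction, double trig_speedup)
{
    return 1.0 / ((1.0 - trig_fraction) + trig_fraction / trig_speedup);
}

/* Percentage reduction of a cost metric (cycles, instructions, seconds)
 * going from the baseline build to the accelerated build. */
static double percent_improvement(double baseline, double accelerated)
{
    return 100.0 * (baseline - accelerated) / baseline;
}
```

For example, if half of the run time is trigonometric work and the hardware version of those operations is twice as fast, `amdahl_speedup(0.5, 2.0)` gives an overall speedup of about 1.33.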
This information is crucial for deciding whether to proceed with the hardware implementation of trigonometric operations. Given the substantial size and complexity of these algorithms, modifying the code to use inline assembly would be a significant undertaking.
Given the substantial size and complexity of these algorithms, modifying the code to use inline assembly would be a significant undertaking.
In that case then you'll probably have to modify the compiler so that it can generate your custom instructions when it detects appropriate intermediate code patterns. This is not a trivial exercise for someone not familiar with compiler toolchain internals. You'll probably need to ask in an appropriate GCC and/or LLVM/Clang (depending on what toolchain you're going to use) forum. There are mailing lists and other discussion forums for these projects that may be relevant to this topic.
Hello team, I am trying to add custom trigonometric instructions to Spike and riscv-gnu-toolchain so that I can execute my workload and assess the performance improvement from these newly added instructions. However, when I run the command sudo make build-sim, it throws errors. I want to use these trigonometric instructions in the same way as the single- and double-precision fsqrt instructions. The steps I am following are given below; please let me know if I am making a mistake or missing anything:
DECLARE_INSN(mod, MATCH_MOD, MASK_MOD)
DECLARE_INSN(gcd, MATCH_GCD, MASK_GCD)
DECLARE_INSN(fact, MATCH_FACT, MASK_FACT)
DECLARE_INSN(fsin_s, MATCH_FSIN_S, MASK_FSIN_S)
DECLARE_INSN(fcos_s, MATCH_FCOS_S, MASK_FCOS_S)
DECLARE_INSN(ftan_s, MATCH_FTAN_S, MASK_FTAN_S)
DECLARE_INSN(fsin_d, MATCH_FSIN_D, MASK_FSIN_D)
DECLARE_INSN(fcos_d, MATCH_FCOS_D, MASK_FCOS_D)
DECLARE_INSN(ftan_d, MATCH_FTAN_D, MASK_FTAN_D)
{"mod", 0, INSN_CLASS_I, "d,s,t", MATCH_MOD, MASK_MOD, match_opcode, 0 },
{"gcd", 0, INSN_CLASS_I, "d,s,t", MATCH_GCD, MASK_GCD, match_opcode, 0 },
{"fact", 0, INSN_CLASS_I, "d,a", MATCH_FACT, MASK_FACT, match_opcode, 0 },
// Single-precision floating-point instruction subset
{"fsin.s", 0, INSN_CLASS_F_INX, "D,S", MATCH_FSIN_S|MASK_RM, MASK_FSIN_S|MASK_RM, match_opcode, 0 },
{"fsin.s", 0, INSN_CLASS_F_INX, "D,S,m", MATCH_FSIN_S, MASK_FSIN_S, match_opcode, 0 },
{"fcos.s", 0, INSN_CLASS_F_INX, "D,S", MATCH_FCOS_S|MASK_RM, MASK_FCOS_S|MASK_RM, match_opcode, 0 },
{"fcos.s", 0, INSN_CLASS_F_INX, "D,S,m", MATCH_FCOS_S, MASK_FCOS_S, match_opcode, 0 },
{"ftan.s", 0, INSN_CLASS_F_INX, "D,S", MATCH_FTAN_S|MASK_RM, MASK_FTAN_S|MASK_RM, match_opcode, 0 },
{"ftan.s", 0, INSN_CLASS_F_INX, "D,S,m", MATCH_FTAN_S, MASK_FTAN_S, match_opcode, 0 },
/* Double-precision floating-point instruction subset. */
{"fsin.d", 0, INSN_CLASS_D_INX, "D,S", MATCH_FSIN_D|MASK_RM, MASK_FSIN_D|MASK_RM, match_opcode, 0 },
{"fsin.d", 0, INSN_CLASS_D_INX, "D,S,m", MATCH_FSIN_D, MASK_FSIN_D, match_opcode, 0 },
{"fcos.d", 0, INSN_CLASS_D_INX, "D,S", MATCH_FCOS_D|MASK_RM, MASK_FCOS_D|MASK_RM, match_opcode, 0 },
{"fcos.d", 0, INSN_CLASS_D_INX, "D,S,m", MATCH_FCOS_D, MASK_FCOS_D, match_opcode, 0 },
{"ftan.d", 0, INSN_CLASS_D_INX, "D,S", MATCH_FTAN_D|MASK_RM, MASK_FTAN_D|MASK_RM, match_opcode, 0 },
{"ftan.d", 0, INSN_CLASS_D_INX, "D,S,m", MATCH_FTAN_D, MASK_FTAN_D, match_opcode, 0 },
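For readers checking their own encodings, the following sketch shows how a MATCH/MASK pair like those in the table entries above is typically tested against an instruction word, and how two pairs can be checked for a collision. The fsqrt.s and fsqrt.d values are quoted from memory of the upstream encoding.h and should be verified against your copy; `insn_matches`, `encodings_overlap`, and `encode_fr1` are hypothetical helper names, not functions from binutils or Spike.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the fsqrt.s / fsqrt.d entries in upstream encoding.h. */
#define MATCH_FSQRT_S 0x58000053u
#define MASK_FSQRT_S  0xfff0007fu
#define MATCH_FSQRT_D 0x5a000053u
#define MASK_FSQRT_D  0xfff0007fu

/* An instruction word matches when every bit selected by the mask equals
 * the corresponding bit of the match value. */
static bool insn_matches(uint32_t insn, uint32_t match, uint32_t mask)
{
    return ((insn ^ match) & mask) == 0;
}

/* Two encodings collide when some instruction word satisfies both, i.e. when
 * they agree on every bit that both masks constrain. */
static bool encodings_overlap(uint32_t match_a, uint32_t mask_a,
                              uint32_t match_b, uint32_t mask_b)
{
    return ((match_a ^ match_b) & (mask_a & mask_b)) == 0;
}

/* Fill in rd, rs1 and the rounding mode for a one-source FP instruction. */
static uint32_t encode_fr1(uint32_t match, unsigned rd, unsigned rs1, unsigned rm)
{
    return match | ((uint32_t)rs1 << 15) | ((uint32_t)rm << 12) | ((uint32_t)rd << 7);
}
```

Running `encodings_overlap` for each new MATCH/MASK pair against the existing ones is a cheap way to confirm that a chosen opcode does not collide with an instruction that is already defined.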
After that, I recompiled riscv-gnu-toolchain with the following configuration:
./configure --prefix=$RISCV --host=riscv64-unknown-elf --with-arch=rv64gcv --with-abi=lp64d --with-sim=spike --enable-multilib
and built and installed it with the command:
sudo make -j$(nproc) && sudo make build-sim
To achieve the required functionality, I want to support these trigonometric instructions in both Spike and GCC. So I first added the MATCH_* and MASK_* definitions for each instruction to spike/riscv/encoding.h.
Then I created the files fsin_s.h, fsin_d.h, fcos_s.h, fcos_d.h, ftan_s.h and ftan_d.h in spike/riscv/insns with the following content:
// fsin_s.h
require_either_extension('F', EXT_ZFINX);
require_fp;
softfloat_roundingMode = RM;
WRITE_FRD_F(f32_sin(FRS1_F));
set_fp_exceptions;
// fsin_d.h
require_either_extension('D', EXT_ZDINX);
require_fp;
softfloat_roundingMode = RM;
WRITE_FRD_D(f64_sin(FRS1_D));
set_fp_exceptions;
// fcos_s.h
require_either_extension('F', EXT_ZFINX);
require_fp;
softfloat_roundingMode = RM;
WRITE_FRD_F(f32_cos(FRS1_F));
set_fp_exceptions;
// fcos_d.h
require_either_extension('D', EXT_ZDINX);
require_fp;
softfloat_roundingMode = RM;
WRITE_FRD_D(f64_cos(FRS1_D));
set_fp_exceptions;
// ftan_s.h
require_either_extension('F', EXT_ZFINX);
require_fp;
softfloat_roundingMode = RM;
WRITE_FRD_F(f32_tan(FRS1_F));
set_fp_exceptions;
// ftan_d.h
require_either_extension('D', EXT_ZDINX);
require_fp;
softfloat_roundingMode = RM;
WRITE_FRD_D(f64_tan(FRS1_D));
set_fp_exceptions;
riscv_insn_ext_d = \
	fsin_d \
	fcos_d \
	ftan_d \
	...
if (isa->extension_enabled('F')) {
  DEFINE_FR1TYPE(fsin_s);
  DEFINE_FR1TYPE(fcos_s);
  DEFINE_FR1TYPE(ftan_s);
  ...
if (isa->extension_enabled(EXT_ZFINX)) {
  DEFINE_R1TYPE(fsin_s);
  DEFINE_R1TYPE(fcos_s);
  DEFINE_R1TYPE(ftan_s);
  ...
if (isa->extension_enabled('D')) {
  DEFINE_FR1TYPE(fsin_d);
  DEFINE_FR1TYPE(fcos_d);
  DEFINE_FR1TYPE(ftan_d);
  ...
if (isa->extension_enabled(EXT_ZDINX)) {
  DEFINE_R1TYPE(fsin_d);
  DEFINE_R1TYPE(fcos_d);
  DEFINE_R1TYPE(ftan_d);
  ...
(define_c_enum "unspec" [
  UNSPEC_SIN
  UNSPEC_COS
  UNSPEC_TAN
  // Add other UNSPEC values if needed
])
(define_attr "type" .....fsin,fcos,ftan,.....
(define_insn "sin
(define_insn "cos
(define_insn "tan
Then I added the following to gcc/gcc/rtl.def:
/* Trigonometric functions */
DEF_RTL_EXPR(SIN, "sin", "e", RTX_UNARY)
DEF_RTL_EXPR(COS, "cos", "e", RTX_UNARY)
DEF_RTL_EXPR(TAN, "tan", "e", RTX_UNARY)
Lastly, I added f32_sin.c, f64_sin.c, f32_cos.c, f64_cos.c, f32_tan.c and f64_tan.c to both directories, riscv-gnu-toolchain/pk/softfloat and riscv-gnu-toolchain/spike/softfloat.
Then I tried to build and install riscv-gnu-toolchain again, running the configure command as follows:
./configure --prefix=$RISCV --host=riscv64-unknown-elf --with-arch=rv64gcv --with-abi=lp64d --with-sim=spike --enable-multilib
sudo make -j$(nproc) && sudo make build-sim
But I am getting the following errors:
/Data/chandra/development/RISCV-SPIKE/RISCV-BUILD/riscv-gnu-toolchain/spike/riscv/overlap_list.h: In member function ‘void processor_t::register_base_instructions()’:
/Data/chandra/development/RISCV-SPIKE/RISCV-BUILD/riscv-gnu-toolchain/spike/riscv/overlap_list.h:20:22: error: ‘rstsa16_supported’ was not declared in this scope; did you mean ‘xperm16_supported’?
20 | DECLARE_OVERLAP_INSN(rstsa16, EXT_ZPN)
| ^~~
/Data/chandra/development/RISCV-SPIKE/RISCV-BUILD/riscv-gnu-toolchain/spike/riscv/processor.cc:1167:45: note: in definition of macro ‘DECLARE_OVERLAP_INSN’
1167 | #define DECLARE_OVERLAP_INSN(name, ext) { name##_supported = isa->extension_enabled(ext); }
| ^~~~
/Data/chandra/development/RISCV-SPIKE/RISCV-BUILD/riscv-gnu-toolchain/spike/riscv/overlap_list.h:21:22: error: ‘rstsa32_supported’ was not declared in this scope; did you mean ‘xperm32_supported’?
21 | DECLARE_OVERLAP_INSN(rstsa32, EXT_ZPN)
| ^~~
/Data/chandra/development/RISCV-SPIKE/RISCV-BUILD/riscv-gnu-toolchain/spike/riscv/processor.cc:1167:45: note: in definition of macro ‘DECLARE_OVERLAP_INSN’
1167 | #define DECLARE_OVERLAP_INSN(name, ext) { name##_supported = isa->extension_enabled(ext); }
| ^~~~
/Data/chandra/development/RISCV-SPIKE/RISCV-BUILD/riscv-gnu-toolchain/spike/riscv/overlap_list.h:22:22: error: ‘srli32_u_supported’ was not declared in this scope; did you mean ‘vle32_v_supported’?
22 | DECLARE_OVERLAP_INSN(srli32_u, EXT_ZPN)
| ^~~~
/Data/chandra/development/RISCV-SPIKE/RISCV-BUILD/riscv-gnu-toolchain/spike/riscv/processor.cc:1167:45: note: in definition of macro ‘DECLARE_OVERLAP_INSN’
1167 | #define DECLARE_OVERLAP_INSN(name, ext) { name##_supported = isa->extension_enabled(ext); }
| ^~~~
/Data/chandra/development/RISCV-SPIKE/RISCV-BUILD/riscv-gnu-toolchain/spike/riscv/overlap_list.h:23:22: error: ‘umax32_supported’ was not declared in this scope; did you mean ‘maxu_supported’?
23 | DECLARE_OVERLAP_INSN(umax32, EXT_ZPN)
| ^~
/Data/chandra/development/RISCV-SPIKE/RISCV-BUILD/riscv-gnu-toolchain/spike/riscv/processor.cc:1167:45: note: in definition of macro ‘DECLARE_OVERLAP_INSN’
1167 | #define DECLARE_OVERLAP_INSN(name, ext) { name##_supported = isa->extension_enabled(ext); }
| ^~~~
At global scope:
cc1plus: warning: unrecognized command line option ‘-Wno-nonportable-include-path’
make[1]: [Makefile:347: processor.o] Error 1
make[1]: Leaving directory '/Data/chandra/development/RISCV-SPIKE/RISCV-BUILD/riscv-gnu-toolchain/build-spike'
make: [Makefile:901: stamps/build-spike] Error 2
Hello team, I have added the trigonometric instructions to riscv-gnu-toolchain. Now I am looking for the steps required to compile my workload for Spike and Gem5 with these newly added instructions, and then to characterize the workload to evaluate the performance impact of these instructions. I would greatly appreciate it if someone could provide guidance on how to compile code that utilizes these newly added instructions and execute it on Spike pk to characterize the workload. Thank you in advance for your assistance.
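One possible way to collect numbers for such a characterization is to read the retired-instruction counter around the region of interest. The sketch below assumes a 64-bit RISC-V target on which user-mode reads of the counter are permitted, and, depending on the toolchain version, the `-march` string may need to include the CSR and counter extensions. `read_instret`, `measure_region`, and `sample_kernel` are hypothetical names for this example. On any other target the code falls back to `clock()` so that it still builds, but that value is processor time, not an instruction count.

```c
#include <stdint.h>
#include <time.h>

/* Read the retired-instruction counter on RV64; fall back to processor time
 * elsewhere so the sketch still compiles and runs on a host machine. */
static inline uint64_t read_instret(void)
{
#if defined(__riscv) && (__riscv_xlen == 64)
    uint64_t n;
    __asm__ volatile ("rdinstret %0" : "=r"(n));
    return n;
#else
    return (uint64_t)clock();
#endif
}

/* Run kernel(arg) once and return the counter delta across the call. */
static uint64_t measure_region(void (*kernel)(void *), void *arg)
{
    uint64_t start = read_instret();
    kernel(arg);
    return read_instret() - start;
}

/* Placeholder workload: sums 0..999 into the long pointed to by arg.
 * Replace it with the real computation being characterized. */
static void sample_kernel(void *arg)
{
    volatile long *acc = (volatile long *)arg;
    for (long i = 0; i < 1000; i++)
        *acc += i;
}
```

Calling `measure_region` on the same kernel in a baseline build and in a build that uses the new instructions gives two counts that can be compared directly.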