cpc / openasip

Open Application-Specific Instruction Set processor tools (OpenASIP)
http://openasip.org

Area, Utilization, Energy Consumption estimation for the RISCV Architecture. #257

Closed KulsumMohammed closed 1 month ago

KulsumMohammed commented 2 months ago

I am trying to implement matrix multiplication for RISC-V. I compiled it with:

oacc-riscv -D_DEBUG -O0 -a start.adf --output-format=bin -o matmul256_base_size4.img matmul256_data_base.c matmul256base.c

The multiplication ran successfully, and I obtained the latency values.

I would now like to know the processor resource utilization and area consumption. According to the PDF manual, the Cost/Performance Estimator provides estimates of energy consumption, die area, and maximum clock rate through the estimate command:

estimate {-p [TPEF] -t [TraceDB]} ADF IDF

This command requires an IDF. To generate one, I ran prode start.adf

I have attached a picture of it. image

I then saved the IDF file as start_riscarchitecture_2.idf. According to the manual, the next step is the generateprocessor command:

generateprocessor -i start_riscvarchitecture_2.idf -o riscv-proc_samearrayvalues_noprintf_MAvaluesidf -t start.adf

However, the command fails with the following error:

Decoder generator supports the 4-stage transport pipeline of GCU with given options. The given machine has 3 stages.
Segmentation fault (core dumped)

How can I find the area, cost, and clock rate for the RISC-V architecture? Is this the right way to obtain the area, clock, and utilization figures? If so, can anyone please help with it?

Thank you for your time and for coming forward to help.

karihepola commented 2 months ago

The estimator relies on information about some standard components (standard FUs, RFs, etc.) for a specific technology node. I am not sure how well it can estimate customized RISC-V processors, as it has no information about the hardware complexity of the customized FUs, or of the RISC-V specific FUs for that matter. For now, the best option is to run your design through the EDA tool flow of your choice to acquire accurate information.

As for the error you are experiencing, try opening your ADF in prode and adding one to the delay slot parameter of the control unit. That should fix the error you are getting.

kuku464c commented 2 months ago

Following your suggestion, we solved the issue by changing the delay slot parameter of the control unit.

As for estimating utilization, I will use a separate EDA tool.

Thank you for helping out.

kuku464c commented 2 months ago

I am trying to implement matrix multiplication, convolution, and other kernels for the RISC-V architecture. I am starting with matrix multiplication. I used:

oacc-riscv -D_DEBUG -O0 -a start.adf --output-format=bin -o matmul256_base_size4.img matmul256_data_base.c matmul256_base.c

I am trying to create a custom operation for matrix multiplication, starting with a 4 by 4 matrix, so the operation will use mul and add operations. Reading through the PDF manual, I found that:

  1. We can define a custom operation with a DAG.
  2. We can then add the new custom operation to the processor architecture as a special instruction.
  3. We then also need to add it to a RISC-V instruction format.
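
For reference, the 4 by 4 case described above can be sketched in plain C as follows. This is only an illustrative baseline under stated assumptions: the function name matmul4 and the constant N are hypothetical and are not taken from matmul256_base.c, and the code uses no OpenASIP-specific intrinsics. It shows that each output element is a chain of multiply and add steps.

```c
#include <stdint.h>

#define N 4  /* 4 by 4 case from the post; the name N is an assumption */

/* Hypothetical reference routine: computes c = a * b using only
 * multiply and add. Each output element takes N multiplies and
 * N adds because the accumulator starts from zero. */
void matmul4(int32_t a[N][N], int32_t b[N][N], int32_t c[N][N])
{
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            int32_t acc = 0;
            for (int k = 0; k < N; k++) {
                acc += a[i][k] * b[k][j];  /* one mul followed by one add */
            }
            c[i][j] = acc;
        }
    }
}
```

A baseline like this could be compiled with the same oacc-riscv command line and used to check the results of any later custom-operation version.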

To define the custom operation, I ran osed & and added a new custom operation named MAT_MUL. image

I saved it and moved on to prode start.adf

image

I opened the ALU unit, clicked "Add from Opset", and filtered for MAT_MUL.

I then see the error below. I could not figure out where I need to change the width. I am sharing all the information I have in the hope that it gives a clear picture of the problem.

image

Thank you for your time and for coming forward to help.

karihepola commented 2 months ago

The RISC-V customization flow only supports custom operations that follow the R-format (2 inputs, 1 output). Multiply-add has three input operands, so it cannot be mapped to this format, even though multiply-add is somewhat of a special case because you usually want to use the same register for the target and accumulation operands. If you want to design such an ASIP with OpenASIP, you have to rely on the transport-triggered architecture template, which is highly flexible in terms of addressing modes.

The error you are getting occurs because the function unit has only 2 input ports, while you are trying to map an operation with 3 input operands to it. If you add a 3rd input port to the function unit, you should be able to proceed. However, hardware generation or compilation is not supported for the RISC-V target for such custom operations.
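
To illustrate the operand-count constraint, here is a minimal C sketch. All function names are hypothetical and this is ordinary C, not OpenASIP operation definitions or intrinsics. It contrasts a three-input multiply-add with an equivalent built from two operations that each take 2 inputs and produce 1 output.

```c
#include <stdint.h>

/* Three inputs, one output: this shape does not fit a
 * 2-input, 1-output instruction format. */
int32_t mac3(int32_t acc, int32_t x, int32_t y)
{
    return acc + x * y;
}

/* Two inputs, one output each: these shapes do fit. */
int32_t mul2(int32_t x, int32_t y) { return x * y; }
int32_t add2(int32_t x, int32_t y) { return x + y; }

/* Same result as mac3, expressed as two 2-input operations. */
int32_t mac_split(int32_t acc, int32_t x, int32_t y)
{
    return add2(acc, mul2(x, y));
}
```

The split form gives the same result as the fused one but needs two operations instead of one, which is the cost of staying within the two-input shape.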

KulsumMohammed commented 1 month ago

Thank you for your time and for coming forward to help. Instead of matrix multiplication, I wanted to see whether I could create a different custom operation for RISC-V, so I defined a new operation named MAXPOOL.
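
As a rough illustration only, a max-style operation can fit the 2-input, 1-output shape as sketched below in plain C. The exact semantics of the MAXPOOL operation defined here are shown only in the screenshots, so the two-operand max and the 2 by 2 window are assumptions, and the names max2 and maxpool2x2 are hypothetical.

```c
#include <stdint.h>

/* Assumed semantics: two inputs, one output, returns the larger value. */
int32_t max2(int32_t a, int32_t b)
{
    return a > b ? a : b;
}

/* Hypothetical 2 by 2 pooling window reduced with three 2-input max steps. */
int32_t maxpool2x2(int32_t w[2][2])
{
    return max2(max2(w[0][0], w[0][1]), max2(w[1][0], w[1][1]));
}
```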

Screenshot from 2024-06-04 12-28-42

Screenshot from 2024-06-04 12-28-16

RTL generation was successful, as seen below in the riscv-proc_max_mat_O3/vhdl folder. Screenshot from 2024-06-04 12-29-30

Now I would like to see the output, area, utilization, and other figures by running the design in Vivado. Can you please explain how to proceed with the analysis in Vivado, or point me to a manual or PDF that would help?

I am sharing all the information I have in the hope that it gives a clear picture of the setup.

Thank you for your time and for coming forward to help.

karihepola commented 1 month ago

If you simply want to evaluate the core, you can synthesize tta0.vhdl (the core entity) as the top-level design to get area and clock frequency estimates from Vivado. Section 3.11.3 of the documentation gives a rough example of how to do this. The example uses the AlmaIF interface, but it should be directly applicable to your case if you select tta0 as the top-level design after importing the riscv-proc_max_mat_O3 directory into the newly created project.