1) Write a C program in a text editor that computes the sum of the integers from 1 to n.
The following image shows the C program.
2) Compile the code with the GCC compiler and run it to see the output.
The image shows the commands used and the output of sum1ton.
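Since the program itself is only shown as an image, here is a minimal sketch of what a sum-1-to-n routine could look like. The function name `sum_1_to_n` is an assumption made for illustration and is not taken from the lab source.

```c
#include <assert.h>

/* Returns 1 + 2 + ... + n using a simple loop.
   Hypothetical helper: the name and signature are illustrative only. */
long sum_1_to_n(int n) {
    assert(n >= 0);               /* the sketch only handles non-negative n */
    long sum = 0;
    for (int i = 1; i <= n; ++i) {
        sum += i;
    }
    return sum;
}
```

Calling `sum_1_to_n(9)` gives 45, the value the later "Sum 1 to 9" test program is checked against.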
1) Compile the Lab 1 program with the RISC-V GCC compiler.
2) In a new tab, inspect the assembly code generated for the C program.
Here we look at the "main" section; the value computed with a calculator is 35 in decimal, i.e. 23 in hexadecimal.
4) Next, compile the same C code with the RISC-V compiler using the -Ofast flag. The image shows the commands used.
5) With -Ofast, the main section (address 100b0) contains 12 instructions (byte addressing).
6) RISC-V compilation output: the code is compiled with the RISC-V GCC compiler using a different optimization flag (-Ofast), and the assembly code is then checked again.
1) Debug the program in Spike using the command below, then run until the program counter reaches 100b0 using the next command.
2) Next, look at the content of a0 using the command below. Pressing Enter executes the next instruction, lui a0,0x21, which modifies a0. The instruction after that is addi sp,sp,-16.
3) The addi instruction updates the stack pointer by -16. Quit Spike by pressing q.
4) Then repeat the procedure from step 1 (enter the Spike simulation again).
Hex: 0x3ffffffb50 - 0x3ffffffb40 = 0x10. Decimal: 274877905744 - 274877905728 = 16.
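The subtraction above can be double-checked with a few lines of C. This is only a sketch; the helper name `stack_growth` is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes by which the stack grew. The stack grows downward,
   so the stack pointer before the adjustment is the larger value.
   Hypothetical helper, not part of the lab code. */
uint64_t stack_growth(uint64_t sp_before, uint64_t sp_after) {
    assert(sp_before >= sp_after);
    return sp_before - sp_after;
}
```

With the two stack pointer values observed in Spike, `stack_growth(0x3ffffffb50, 0x3ffffffb40)` returns 16 (0x10), which matches the -16 immediate of `addi sp,sp,-16`.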
R-type format: funct7 (7 bits) | rs2 (5 bits) | rs1 (5 bits) | funct3 (3 bits) | rd (5 bits) | opcode (7 bits)
I-type format: imm[11:0] (12 bits) | rs1 (5 bits) | funct3 (3 bits) | rd (5 bits) | opcode (7 bits)
S-type format: imm[11:5] (7 bits) | rs2 (5 bits) | rs1 (5 bits) | funct3 (3 bits) | imm[4:0] (5 bits) | opcode (7 bits)
B-type format: imm[12] (1 bit) | imm[10:5] (6 bits) | rs2 (5 bits) | rs1 (5 bits) | funct3 (3 bits) | imm[4:1] (4 bits) | imm[11] (1 bit) | opcode (7 bits)
U-type format: imm[31:12] (20 bits) | rd (5 bits) | opcode (7 bits)
J-type format: imm[20] (1 bit) | imm[10:1] (10 bits) | imm[11] (1 bit) | imm[19:12] (8 bits) | rd (5 bits) | opcode (7 bits)
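To make the field layouts above concrete, the following sketch packs the fields of the R, I, S and B formats into a 32-bit word. The function names (`encode_r`, `encode_i`, `encode_s`, `encode_b`) are assumptions made for illustration; the bit positions follow the format lines above.

```c
#include <assert.h>
#include <stdint.h>

/* R-type: funct7 | rs2 | rs1 | funct3 | rd | opcode */
uint32_t encode_r(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                  uint32_t funct3, uint32_t rd, uint32_t opcode) {
    assert(rs1 < 32 && rs2 < 32 && rd < 32);   /* 5-bit register fields */
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | opcode;
}

/* I-type: imm[11:0] | rs1 | funct3 | rd | opcode */
uint32_t encode_i(int32_t imm, uint32_t rs1, uint32_t funct3,
                  uint32_t rd, uint32_t opcode) {
    uint32_t u = (uint32_t)imm & 0xFFFu;       /* keep the low 12 bits */
    return (u << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode;
}

/* S-type: imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode */
uint32_t encode_s(int32_t imm, uint32_t rs2, uint32_t rs1,
                  uint32_t funct3, uint32_t opcode) {
    uint32_t u = (uint32_t)imm & 0xFFFu;
    return ((u >> 5) << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | ((u & 0x1Fu) << 7) | opcode;
}

/* B-type: imm[12] | imm[10:5] | rs2 | rs1 | funct3 | imm[4:1] | imm[11] | opcode.
   Bit 0 of the byte offset is not stored, so odd offsets lose that bit. */
uint32_t encode_b(int32_t imm, uint32_t rs2, uint32_t rs1,
                  uint32_t funct3, uint32_t opcode) {
    uint32_t u = (uint32_t)imm;
    return (((u >> 12) & 1u) << 31) | (((u >> 5) & 0x3Fu) << 25) |
           (rs2 << 20) | (rs1 << 15) | (funct3 << 12) |
           (((u >> 1) & 0xFu) << 8) | (((u >> 11) & 1u) << 7) | opcode;
}
```

For example, `encode_r(0x00, 2, 1, 0x0, 0, 0x33)` packs an ADD with rd = r0, rs1 = r1, rs2 = r2, and `encode_i(5, 2, 0x0, 2, 0x13)` packs ADDI r2, r2, 5.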
ADD r0, r1, r2
Type: R
Encoding: 0000000 00010 00001 000 00000 0110011
Binary: 00000000001000001000000000110011
SUB r2, r0, r1
Type: R
Encoding: 0100000 00001 00000 000 00010 0110011
Binary: 01000000000100000000000100110011
AND r1, r0, r2
Type: R
Encoding: 0000000 00010 00000 111 00001 0110011
Binary: 00000000001000000111000010110011
OR r8, r1, r5
Type: R
Encoding: 0000000 00101 00001 110 01000 0110011
Binary: 00000000010100001110010000110011
XOR r8, r0, r4
Type: R
Encoding: 0000000 00100 00000 100 01000 0110011
Binary: 00000000010000000100010000110011
SLT r0, r1, r4
Type: R
Encoding: 0000000 00100 00001 010 00000 0110011
Binary: 00000000010000001010000000110011
ADDI r2, r2, 5
Type: I
Encoding: 000000000101 00010 000 00010 0010011
Binary: 00000000010100010000000100010011
SW r2, r0, 4
Type: S
Encoding: 0000000 00010 00000 010 00100 0100011
Binary: 00000000001000000010001000100011
SRL r6, r1, r1
Type: R
Encoding: 0000000 00001 00001 101 00110 0110011
Binary: 00000000000100001101001100110011
BNE r0, r0, 20
Type: B
Encoding: 0000000 00000 00000 001 10100 1100011
Binary: 00000000000000000001101001100011
BEQ r0, r0, 15
Type: B
Encoding: 0000000 00000 00000 000 01110 1100011
Binary: 00000000000000000000011101100011
Note: the B format stores only imm[12:1], so bit 0 of the odd offset 15 cannot be encoded and the stored offset is 14.
LW r3, r1, 2
Type: I
Encoding: 000000000010 00001 010 00011 0000011
Binary: 00000000001000001010000110000011
SLL r5, r1, r1
Type: R
Encoding: 0000000 00001 00001 001 00101 0110011
Binary: 00000000000100001001001010110011
Instruction | Type | 32-bit Binary Representation | Hexadecimal |
---|---|---|---|
ADD r0, r1, r2 | R | 0000000 00010 00001 000 00000 0110011 | 0x00208033 |
SUB r2, r0, r1 | R | 0100000 00001 00000 000 00010 0110011 | 0x40100133 |
AND r1, r0, r2 | R | 0000000 00010 00000 111 00001 0110011 | 0x002070B3 |
OR r8, r1, r5 | R | 0000000 00101 00001 110 01000 0110011 | 0x0050E433 |
XOR r8, r0, r4 | R | 0000000 00100 00000 100 01000 0110011 | 0x00404433 |
SLT r0, r1, r4 | R | 0000000 00100 00001 010 00000 0110011 | 0x0040A033 |
ADDI r2, r2, 5 | I | 000000000101 00010 000 00010 0010011 | 0x00510113 |
SW r2, r0, 4 | S | 0000000 00010 00000 010 00100 0100011 | 0x00202223 |
SRL r6, r1, r1 | R | 0000000 00001 00001 101 00110 0110011 | 0x0010D333 |
BNE r0, r0, 20 | B | 0000000 00000 00000 001 10100 1100011 | 0x00001A63 |
BEQ r0, r0, 15 | B | 0000000 00000 00000 000 01110 1100011 | 0x00000763 |
LW r3, r1, 2 | I | 000000000010 00001 010 00011 0000011 | 0x0020A183 |
SLL r5, r1, r1 | R | 0000000 00001 00001 001 00101 0110011 | 0x001092B3 |

Note: branch offsets are stored as imm[12:1], so the odd offset 15 of the BEQ row is encoded as 14.
Instruction | Type | 32-bit Instruction Code | Binary Representation | Hexadecimal Representation |
---|---|---|---|---|
ADD r0, r1, r2 | R | 0000000 00010 00001 000 00000 0110011 | 00000000001000001000000000110011 | 0x00208033 |
SUB r2, r0, r1 | R | 0100000 00001 00000 000 00010 0110011 | 01000000000100000000000100110011 | 0x40100133 |
AND r1, r0, r2 | R | 0000000 00010 00000 111 00001 0110011 | 00000000001000000111000010110011 | 0x002070B3 |
OR r8, r1, r5 | R | 0000000 00101 00001 110 01000 0110011 | 00000000010100001110010000110011 | 0x0050E433 |
XOR r8, r0, r4 | R | 0000000 00100 00000 100 01000 0110011 | 00000000010000000100010000110011 | 0x00404433 |
SLT r0, r1, r4 | R | 0000000 00100 00001 010 00000 0110011 | 00000000010000001010000000110011 | 0x0040A033 |
ADDI r2, r2, 5 | I | 000000000101 00010 000 00010 0010011 | 00000000010100010000000100010011 | 0x00510113 |
SW r2, r0, 4 | S | 0000000 00010 00000 010 00100 0100011 | 00000000001000000010001000100011 | 0x00202223 |
SRL r6, r1, r1 | R | 0000000 00001 00001 101 00110 0110011 | 00000000000100001101001100110011 | 0x0010D333 |
BNE r0, r0, 20 | B | 0000000 00000 00000 001 10100 1100011 | 00000000000000000001101001100011 | 0x00001A63 |
BEQ r0, r0, 15 | B | 0000000 00000 00000 000 01110 1100011 | 00000000000000000000011101100011 | 0x00000763 |
LW r3, r1, 2 | I | 000000000010 00001 010 00011 0000011 | 00000000001000001010000110000011 | 0x0020A183 |
SLL r5, r1, r1 | R | 0000000 00001 00001 001 00101 0110011 | 00000000000100001001001010110011 | 0x001092B3 |
Operation | Standard RISC-V ISA (Hex) | Standard RISC-V ISA (Binary) | Hardcoded ISA (Hex) | Hardcoded ISA (Binary) |
---|---|---|---|---|
ADD | 32'h00208033 | 000000000010 00001 000 00000 0110011 | 32'h02208300 | 0000 0010 0010 0000 1000 0011 0000 0000 |
SUB | 32'h40100133 | 010000000001 00000 000 00010 0110011 | 32'h02209380 | 0000 0010 0010 0000 1001 0011 1000 0000 |
AND | 32'h002070b3 | 000000000010 00000 111 00001 0110011 | 32'h0230a400 | 0000 0010 0011 0000 1010 0100 0000 0000 |
OR | 32'h0050e433 | 000000000101 00001 110 01000 0110011 | 32'h02513480 | 0000 0010 0101 0001 0011 0100 1000 0000 |
XOR | 32'h00404433 | 000000000100 00000 100 01000 0110011 | 32'h0240c500 | 0000 0010 0100 0000 1100 0101 0000 0000 |
SLT | 32'h0040a033 | 000000000100 00001 010 00000 0110011 | 32'h02415580 | 0000 0010 0100 0001 0101 0101 1000 0000 |
ADDI | 32'h00510113 | 000000000101 00010 000 00010 0010011 | 32'h00520600 | 0000 0000 0101 0010 0000 0110 0000 0000 |
SW | 32'h00202223 | 000000000010 00000 010 00100 0100011 | 32'h00209181 | 0000 0000 0010 0000 1001 0001 1000 0001 |
SRL | 32'h0010d333 | 000000000001 00001 101 00110 0110011 | 32'h00271803 | 0000 0000 0010 0111 0001 1000 0000 0011 |
BNE | 32'h00001a63 | 000000000000 00000 001 10100 1100011 | 32'h01409002 | 0000 0001 0100 0000 1001 0000 0000 0010 |
BEQ | 32'h00000763 | 000000000000 00000 000 01110 1100011 | 32'h00f00002 | 0000 0000 1111 0000 0000 0000 0000 0010 |
LW | 32'h0020a183 | 000000000010 00001 010 00011 0000011 | 32'h00208681 | 0000 0000 0010 0000 1000 0110 1000 0001 |
SLL | 32'h001092b3 | 000000000001 00001 001 00101 0110011 | 32'h00208783 | 0000 0000 0010 0000 1000 0111 1000 0011 |
#include <stdio.h>
// Report whether a number is odd or even by testing its least significant bit
void check_parity(int number) {
    if (number & 1) {
        printf("The number %d has odd parity.\n", number);
    } else {
        printf("The number %d has even parity.\n", number);
    }
}
int main(void) {
    int number;
    // Prompt the user to enter a number
    printf("Enter an integer: ");
    // Stop if the input is not a valid integer
    if (scanf("%d", &number) != 1) {
        printf("Invalid input.\n");
        return 1;
    }
    // Check and print the parity
    check_parity(number);
    return 0;
}
Output of the parity checker C program compiled with the RISC-V GCC compiler:
Hence, we got the same output with both the GCC and the RISC-V GCC compilers.
In this task we use Makerchip, where we can design and simulate digital circuits using TL-Verilog without installing any additional software.
TL-Verilog is a modern hardware description language designed to simplify and accelerate digital design by reducing the complexity we face in Verilog and VHDL.
A combinational calculator is implemented in TL-Verilog on Makerchip. Screenshot of the implementation of the basic combinational circuit in Makerchip:
Screenshot of the implementation of the sequential calculator in Makerchip:
Pipelining is a technique used in digital system design to improve throughput by dividing a complex task into smaller, sequential stages. Each stage performs a specific operation on the data, and the stages are arranged one after another in a pipeline. In a pipelined architecture, the processing of an instruction is divided into several stages, which allows the execution of multiple instructions to overlap and reduces the overall time needed to complete a sequence of tasks. Because each stage contains less logic, the circuit can also be operated at a higher clock frequency.
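As a rough illustration of the overlap described above, the sketch below compares cycle counts with and without pipelining. It assumes an idealized pipeline with no stalls or flushes, and the function names are hypothetical.

```c
#include <assert.h>

/* Cycles to finish n operations when each one occupies the whole
   k-stage datapath before the next may start (no overlap). */
unsigned long cycles_unpipelined(unsigned long n, unsigned long k) {
    return n * k;
}

/* Cycles for an ideal k-stage pipeline: k cycles until the first
   result, then one result per cycle. Assumes no stalls or flushes. */
unsigned long cycles_pipelined(unsigned long n, unsigned long k) {
    assert(k >= 1);               /* a pipeline has at least one stage */
    return n == 0 ? 0 : k + (n - 1);
}
```

For 10 operations on a 4-stage datapath this gives 40 cycles without overlap and 13 cycles with ideal pipelining. Real pipelines lose some of that gain to hazards such as taken branches.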
The screenshot shows the implementation of the pipelined logic in Makerchip.
Validity is used to track the state and timing of transactions within a design. In TL-Verilog, transactions represent the higher-level actions that occur in a design, and validity indicates whether a transaction is valid or invalid based on its state. Validity provides easier debugging, cleaner design, better error checking, and automated clock gating.
Below is the screenshot of the 2-cycle calculator with validity:
TL-Verilog code:
|calc
   @0
      $reset = *reset;
      $clk_nit = *clk;
   @1
      $valid = $reset ? 0 : >>1$valid + 1;
      $valid_or_reset = $valid || $reset;
   ?$valid_or_reset
      @1
         $val1[31:0] = >>2$out[31:0];
         $val2[31:0] = $rand2[3:0];
         $sel[2:0] = $rand3[2:0];
         $sum[31:0] = $val1[31:0] + $val2[31:0];
         $diff[31:0] = $val1[31:0] - $val2[31:0];
         $prod[31:0] = $val1[31:0] * $val2[31:0];
         $quot[31:0] = $val1[31:0] / $val2[31:0];
      @2
         $mem[31:0] = $reset ? '0 : ($sel == 3'd5) ? >>2$out[31:0]
                                                   : >>2$mem[31:0];
         $recall[31:0] = >>2$mem[31:0];
         $out[31:0] = $reset ? '0 : ($sel == 3'd0) ? $sum[31:0]
                                  : ($sel == 3'd1) ? $diff[31:0]
                                  : ($sel == 3'd2) ? $prod[31:0]
                                  : ($sel == 3'd3) ? $quot[31:0]
                                  : ($sel == 3'd4) ? $recall[31:0]
                                  : '0;
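The selection logic of the calculator can be mirrored in a small software model, which may help when checking waveform values by hand. This is only a sketch: it ignores reset and the two-cycle timing, the function names are hypothetical, and returning 0 on division by zero is a modelling choice rather than the hardware behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Output select: sel 0..3 choose sum, difference, product and quotient,
   sel 4 recalls the stored value, anything else gives 0. */
uint32_t calc_out(uint32_t sel, uint32_t val1, uint32_t val2, uint32_t mem) {
    assert(sel < 8);                              /* $sel is a 3-bit signal */
    switch (sel) {
    case 0: return val1 + val2;
    case 1: return val1 - val2;                   /* wraps like 32-bit hardware */
    case 2: return val1 * val2;
    case 3: return val2 != 0 ? val1 / val2 : 0;   /* model choice for /0 */
    case 4: return mem;
    default: return 0;
    }
}

/* Memory update: sel == 5 stores the previous result, otherwise hold. */
uint32_t calc_mem_next(uint32_t sel, uint32_t prev_out, uint32_t mem) {
    return sel == 5 ? prev_out : mem;
}
```

For example, `calc_out(2, 7, 3, 0)` returns 21, and `calc_mem_next(5, 42, 1)` returns 42 because select value 5 stores the previous output.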
Below is the screenshot of the implementation of the code in Makerchip:
\m4_TLV_version 1d: tl-x.org
\SV
// This code can be found in: https://github.com/stevehoover/RISC-V_MYTH_Workshop
m4_include_lib(['https://raw.githubusercontent.com/BalaDhinesh/RISC-V_MYTH_Workshop/master/tlv_lib/risc-v_shell_lib.tlv'])
\SV
m4_makerchip_module // (Expanded in Nav-TLV pane.)
\TLV
// /====================\
// | Sum 1 to 9 Program |
// \====================/
//
// Program for MYTH Workshop to test RV32I
// Add 1,2,3,...,9 (in that order).
//
// Regs:
// r10 (a0): In: 0, Out: final sum
// r12 (a2): 10
// r13 (a3): 1..10
// r14 (a4): Sum
//
// External to function:
m4_asm(ADD, r10, r0, r0) // Initialize r10 (a0) to 0.
// Function:
m4_asm(ADD, r14, r10, r0) // Initialize sum register a4 with 0x0
m4_asm(ADDI, r12, r10, 1010) // Store count of 10 in register a2.
m4_asm(ADD, r13, r10, r0) // Initialize intermediate sum register a3 with 0
// Loop:
m4_asm(ADD, r14, r13, r14) // Incremental addition
m4_asm(ADDI, r13, r13, 1) // Increment intermediate register by 1
m4_asm(BLT, r13, r12, 1111111111000) // If a3 is less than a2, branch to label named <loop>
m4_asm(ADD, r10, r14, r0) // Store final result to register a0 so that it can be read by main program
// Optional:
// m4_asm(JAL, r7, 00000000000000000000) // Done. Jump to itself (infinite loop). (Up to 20-bit signed immediate plus implicit 0 bit (unlike JALR) provides byte address; last immediate bit should also be 0)
m4_define_hier(['M4_IMEM'], M4_NUM_INSTRS)
|cpu
@0
$reset = *reset;
// YOUR CODE HERE
// ...
// Note: Because of the magic we are using for visualisation, if visualisation is enabled below,
// be sure to avoid having unassigned signals (which you might be using for random inputs)
// other than those specifically expected in the labs. You'll get strange errors for these.
// Assert these to end simulation (before Makerchip cycle limit).
*passed = *cyc_cnt > 40;
*failed = 1'b0;
// Macro instantiations for:
// o instruction memory
// o register file
// o data memory
// o CPU visualization
|cpu
//m4+imem(@1) // Args: (read stage)
//m4+rf(@1, @1) // Args: (read stage, write stage) - if equal, no register bypass is required
//m4+dmem(@4) // Args: (read/write stage)
//m4+cpu_viz(@4) // For visualisation, argument should be at least equal to the last stage of CPU logic. @4 would work for all labs.
\SV
endmodule
In the fetch stage, the CPU fetches the next instruction to be executed from the instruction memory. The address to fetch from is given by the program counter (PC). The PC update and the instruction fetch are implemented as follows:
|cpu
   @0
      $reset = *reset;
      $pc[31:0] = $reset ? '0 : >>1$pc + 32'd4;
      $imem_rd_en = !$reset ? 1 : 0;
      $imem_rd_addr[31:0] = $pc[M4_IMEM_INDEX_CNT+1:2];
   @1
      $instr[31:0] = $imem_rd_data[31:0];
The PC value is fed to the instruction memory as the read address, and the instruction stored at that location is fetched. Screenshot of the implementation of the fetch logic in Makerchip:
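The relationship between the byte-addressed PC and the word-indexed instruction memory can be sketched in C as follows. The function names are hypothetical, and the model does not truncate the index to `M4_IMEM_INDEX_CNT` bits as the TL-Verilog slice does.

```c
#include <assert.h>
#include <stdint.h>

/* Next PC for straight-line code: 0 while reset is asserted,
   otherwise the previous PC plus one 4-byte instruction. */
uint32_t next_pc(int reset, uint32_t prev_pc) {
    return reset ? 0u : prev_pc + 4u;
}

/* Word index into the instruction memory: the two low bits of the
   PC are the byte offset within a 32-bit word and are dropped. */
uint32_t imem_index(uint32_t pc) {
    assert((pc & 3u) == 0);       /* instructions are 4-byte aligned */
    return pc >> 2;
}
```

A PC of 8 therefore selects instruction memory entry 2, which corresponds to the `$pc[...:2]` slice used for `$imem_rd_addr`.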
Code for all of the above decode logic labs:
// The fetched instruction is named $instr in the fetch logic, so the same name is used here.
$is_i_instr = $instr[6:2] ==? 5'b0000x ||
              $instr[6:2] ==? 5'b001x0 ||
              $instr[6:2] ==? 5'b11001;
$is_u_instr = $instr[6:2] ==? 5'b0x101;
$is_r_instr = $instr[6:2] ==? 5'b01011 ||
              $instr[6:2] ==? 5'b011x0 ||
              $instr[6:2] ==? 5'b10100;
$is_b_instr = $instr[6:2] ==? 5'b11000;
$is_j_instr = $instr[6:2] ==? 5'b11011;
$is_s_instr = $instr[6:2] ==? 5'b0100x;
$imm[31:0] = $is_i_instr ? {{21{$instr[31]}}, $instr[30:20]} :
             $is_s_instr ? {{21{$instr[31]}}, $instr[30:25], $instr[11:8], $instr[7]} :
             $is_b_instr ? {{20{$instr[31]}}, $instr[7], $instr[30:25], $instr[11:8], 1'b0} :
             $is_u_instr ? {$instr[31], $instr[30:20], $instr[19:12], 12'b0} :
             $is_j_instr ? {{12{$instr[31]}}, $instr[19:12], $instr[20], $instr[30:21], 1'b0} :
                           32'b0;
$rs1_use = $is_r_instr || $is_i_instr || $is_s_instr || $is_b_instr;
?$rs1_use
   $rs1[4:0] = $instr[19:15];
$rs2_use = $is_r_instr || $is_s_instr || $is_b_instr;
?$rs2_use
   $rs2[4:0] = $instr[24:20];
$funct3_use = $is_r_instr || $is_i_instr || $is_s_instr || $is_b_instr;
?$funct3_use
   $funct3[2:0] = $instr[14:12];
$funct7_use = $is_r_instr;
?$funct7_use
   $funct7[6:0] = $instr[31:25];
$rd_use = $is_r_instr || $is_i_instr || $is_u_instr || $is_j_instr;
?$rd_use
   $rd[4:0] = $instr[11:7];
// The opcode is the low 7 bits of the instruction.
$opcode[6:0] = $instr[6:0];
Screenshot for the labs on instruction type decode logic, instruction immediate decoding, and instruction field decode logic:
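The immediate decoding can be cross-checked in software. The sketch below reassembles and sign-extends the I, S and B immediates from a 32-bit instruction word, following the same bit positions as the `$imm` expression. The function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of v to a 32-bit signed value. */
static int32_t sext(uint32_t v, unsigned bits) {
    assert(bits >= 1 && bits < 32);
    uint32_t sign = 1u << (bits - 1);
    v &= (1u << bits) - 1u;                 /* keep only the field bits */
    return (int32_t)((v ^ sign) - sign);
}

/* I-type: imm[11:0] = instr[31:20] */
int32_t imm_i(uint32_t instr) {
    return sext(instr >> 20, 12);
}

/* S-type: imm[11:5] = instr[31:25], imm[4:0] = instr[11:7] */
int32_t imm_s(uint32_t instr) {
    return sext(((instr >> 25) << 5) | ((instr >> 7) & 0x1Fu), 12);
}

/* B-type: imm[12] = instr[31], imm[11] = instr[7],
   imm[10:5] = instr[30:25], imm[4:1] = instr[11:8], imm[0] = 0 */
int32_t imm_b(uint32_t instr) {
    uint32_t v = (((instr >> 31) & 1u) << 12) | (((instr >> 7) & 1u) << 11) |
                 (((instr >> 25) & 0x3Fu) << 5) | (((instr >> 8) & 0xFu) << 1);
    return sext(v, 13);
}
```

As a usage example, the word 0xFEC6CCE3 is a BLT with rs1 = r13, rs2 = r12 and the immediate 1111111111000 used by the test program's loop branch, and `imm_b(0xFEC6CCE3)` returns -8, i.e. a branch two instructions back.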
Code for the above decode logic:
$dec_bits [10:0] = {$funct7[5], $funct3, $opcode};
$is_add = $dec_bits ==? 11'b0_000_0110011;
$is_addi = $dec_bits ==? 11'bx_000_0010011;
$is_beq = $dec_bits ==? 11'bx_000_1100011;
$is_bne = $dec_bits ==? 11'bx_001_1100011;
$is_blt = $dec_bits ==? 11'bx_100_1100011;
$is_bge = $dec_bits ==? 11'bx_101_1100011;
$is_bltu = $dec_bits ==? 11'bx_110_1100011;
$is_bgeu = $dec_bits ==? 11'bx_111_1100011;
Code for the register file read:
$rf_rd_en1 = $rs1_use;
$rf_rd_index1[4:0] = $rs1;
$rf_rd_en2 = $rs2_use;
$rf_rd_index2[4:0] = $rs2;
$src1_value[31:0] = $rf_rd_data1;
$src2_value[31:0] = $rf_rd_data2;
Code for the above lab (ALU operations for ADD and ADDI):
$result[31:0] = $is_addi ? $src1_value + $imm :
$is_add ? $src1_value + $src2_value :
32'bx ;
Code for the above lab (register file write):
$rf_wr_en = $rd_use;
$rf_wr_index[4:0] = $rd;
$rf_wr_data[31:0] = $rd == 0 ? 0 : $result;
The screenshot below shows the register file read, the ALU operation for ADD and ADDI, and the register file write.
Code for the above lab (branch instructions):
$taken_branch = $is_beq ? ($src1_value == $src2_value):
$is_bne ? ($src1_value != $src2_value):
$is_blt ? (($src1_value < $src2_value)^($src1_value[31] != $src2_value[31])):
$is_bge ? (($src1_value >= $src2_value)^($src1_value[31] != $src2_value[31])):
$is_bltu ? ($src1_value < $src2_value):
$is_bgeu ? ($src1_value >= $src2_value):
1'b0;
`BOGUS_USE($taken_branch)
$br_tgt_pc[31:0] = $pc + $imm;
*passed = |cpu/xreg[10]>>5$value == (1+2+3+4+5+6+7+8+9) ;
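The BLT and BGE conditions above build a signed comparison out of an unsigned one by XOR-ing with "the sign bits differ". A small C model makes it easy to convince oneself that this is equivalent to a native signed comparison. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Signed "a < b" from an unsigned compare, mirroring the
   (a < b) ^ (a[31] != b[31]) expression used for BLT. */
int blt_taken(uint32_t a, uint32_t b) {
    int unsigned_lt = a < b;
    int signs_differ = (a >> 31) != (b >> 31);
    return unsigned_lt ^ signs_differ;
}

/* Signed "a >= b", mirroring the expression used for BGE. */
int bge_taken(uint32_t a, uint32_t b) {
    int unsigned_ge = a >= b;
    int signs_differ = (a >> 31) != (b >> 31);
    return unsigned_ge ^ signs_differ;
}

/* Cross-check both helpers against C's native signed comparison. */
void check_branch_model(uint32_t a, uint32_t b) {
    assert(blt_taken(a, b) == ((int32_t)a < (int32_t)b));
    assert(bge_taken(a, b) == ((int32_t)a >= (int32_t)b));
}
```

When both operands have the same sign, the unsigned and signed orderings agree. When the signs differ, the unsigned result is exactly inverted, which is what the XOR corrects.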
\m4_TLV_version 1d: tl-x.org
\SV
   // This code can be found in: https://github.com/stevehoover/RISC-V_MYTH_Workshop
   m4_include_lib(['https://raw.githubusercontent.com/BalaDhinesh/RISC-V_MYTH_Workshop/master/tlv_lib/risc-v_shell_lib.tlv'])
\SV
   m4_makerchip_module   // (Expanded in Nav-TLV pane.)
\TLV
   // /====================\
   // | Sum 1 to 9 Program |
   // \====================/
   //
   // Program for MYTH Workshop to test RV32I
   // Add 1,2,3,...,9 (in that order).
   //
   // Regs:
   //  r10 (a0): In: 0, Out: final sum
   //  r12 (a2): 10
   //  r13 (a3): 1..10
   //  r14 (a4): Sum
   //
   // External to function:
   m4_asm(ADD, r10, r0, r0)             // Initialize r10 (a0) to 0.
   // Function:
   m4_asm(ADD, r14, r10, r0)            // Initialize sum register a4 with 0x0
   m4_asm(ADDI, r12, r10, 1010)         // Store count of 10 in register a2.
   m4_asm(ADD, r13, r10, r0)            // Initialize intermediate sum register a3 with 0
   // Loop:
   m4_asm(ADD, r14, r13, r14)           // Incremental addition
   m4_asm(ADDI, r13, r13, 1)            // Increment intermediate register by 1
   m4_asm(BLT, r13, r12, 1111111111000) // If a3 is less than a2, branch to label named <loop>
   m4_asm(ADD, r10, r14, r0)            // Store final result to register a0 so that it can be read by main program
   // Optional:
   // m4_asm(JAL, r7, 00000000000000000000) // Done. Jump to itself (infinite loop). (Up to 20-bit signed immediate plus implicit 0 bit (unlike JALR) provides byte address; last immediate bit should also be 0)
   m4_define_hier(['M4_IMEM'], M4_NUM_INSTRS)
   |cpu
      @0
         $reset = *reset;
         $clk_nitheesh = *clk;
         $start = >>1$reset ? !$reset ? '1 :'0 :'0;
         //$valid = $reset ? '0 : $start ? '1 : >>3$valid ? '1 : '0;
         $pc[31:0] = (>>1$reset) ? '0 :
                     (>>3$valid_taken_br) ? >>3$br_tgt_pc :
                     >>1$inc_pc;
         $imem_rd_en = !$reset;
         $imem_rd_addr[31:0] = $pc[M4_IMEM_INDEX_CNT+1:2];
      @1
         $inc_pc[31:0] = $pc[31:0] + 32'd4;
         $instr[31:0] = $imem_rd_data[31:0];
         $is_i_instr = $instr[6:2] ==? 5'b0000x || $instr[6:2] ==? 5'b001x0 || $instr[6:2] == 5'b11001;
         $is_r_instr = $instr[6:2] == 5'b01011 || $instr[6:2] ==? 5'b011x0 || $instr[6:2] == 5'b10100;
         $is_s_instr = $instr[6:2] ==? 5'b0100x;
         $is_b_instr = $instr[6:2] == 5'b11000;
         $is_j_instr = $instr[6:2] == 5'b11011;
         $is_u_instr = $instr[6:2] ==? 5'b0x101;
         $imm[31:0] = $is_i_instr ? {{21{$instr[31]}}, $instr[30:20]} :
                      $is_s_instr ? {{21{$instr[31]}}, $instr[30:25], $instr[11:7]} :
                      $is_b_instr ? {{20{$instr[31]}}, $instr[7], $instr[30:25], $instr[11:8], 1'b0}:
                      $is_u_instr ? {$instr[31:12], 12'b0} :
                      $is_j_instr ? {{12{$instr[31]}}, $instr[19:12], $instr[20], $instr[30:21], 1'b0} :
                      32'b0;
         $rs2_valid = $is_r_instr || $is_s_instr || $is_b_instr;
         $rs1_valid = $is_r_instr || $is_s_instr || $is_b_instr || $is_i_instr;
         $rd_valid = $is_r_instr || $is_i_instr || $is_u_instr || $is_j_instr;
         $funct3_valid = $is_r_instr || $is_s_instr || $is_b_instr || $is_i_instr;
         $funct7_valid = $is_r_instr;
         ?$rs2_valid
            $rs2[4:0] = $instr[24:20];
         ?$rs1_valid
            $rs1[4:0] = $instr[19:15];
         ?$rd_valid
            $rd[4:0] = $instr[11:7];
         ?$funct3_valid
            $funct3[2:0] = $instr[14:12];
         ?$funct7_valid
            $funct7[6:0] = $instr[31:25];
         $opcode[6:0] = $instr[6:0];
         $dec_bits[10:0] = {$funct7[5],$funct3,$opcode};
         // Branch Instruction
         $is_beq = $dec_bits ==? 11'bx_000_1100011;
         $is_bne = $dec_bits ==? 11'bx_001_1100011;
         $is_blt = $dec_bits ==? 11'bx_100_1100011;
         $is_bge = $dec_bits ==? 11'bx_101_1100011;
         $is_bltu = $dec_bits ==? 11'bx_110_1100011;
         $is_bgeu = $dec_bits ==? 11'bx_111_1100011;
         // Arithmetic Instruction
         $is_add = $dec_bits ==? 11'b0_000_0110011;
         $is_addi = $dec_bits ==? 11'bx_000_0010011;
         $is_or = $dec_bits ==? 11'b0_110_0110011;
         $is_ori = $dec_bits ==? 11'bx_110_0010011;
         $is_xor = $dec_bits ==? 11'b0_100_0110011;
         $is_xori = $dec_bits ==? 11'bx_100_0010011;
         $is_and = $dec_bits ==? 11'b0_111_0110011;
         $is_andi = $dec_bits ==? 11'bx_111_0010011;
         $is_sub = $dec_bits ==? 11'b1_000_0110011;
         $is_slti = $dec_bits ==? 11'bx_010_0010011;
         $is_sltiu = $dec_bits ==? 11'bx_011_0010011;
         $is_slli = $dec_bits ==? 11'b0_001_0010011;
         $is_srli = $dec_bits ==? 11'b0_101_0010011;
         $is_srai = $dec_bits ==? 11'b1_101_0010011;
         $is_sll = $dec_bits ==? 11'b0_001_0110011;
         $is_slt = $dec_bits ==? 11'b0_010_0110011;
         $is_sltu = $dec_bits ==? 11'b0_011_0110011;
         $is_srl = $dec_bits ==? 11'b0_101_0110011;
         $is_sra = $dec_bits ==? 11'b1_101_0110011;
         // Load Instruction
         $is_load = $dec_bits ==? 11'bx_xxx_0000011;
         // Store Instruction
         $is_sb = $dec_bits ==? 11'bx_000_0100011;
         $is_sh = $dec_bits ==? 11'bx_001_0100011;
         $is_sw = $dec_bits ==? 11'bx_010_0100011;
         // Jump Instruction
         $lui = $dec_bits ==? 11'bx_xxx_0110111;
         $auipc = $dec_bits ==? 11'bx_xxx_0010111;
         $jal = $dec_bits ==? 11'bx_xxx_1101111;
         $jalr = $dec_bits ==? 11'bx_000_1100111;
      @2
         $br_tgt_pc[31:0] = $pc + $imm;
         // Register File Read Logic
         $rf_rd_en1 = $rs1_valid;
         ?$rf_rd_en1
            $rf_rd_index1[4:0] = $rs1[4:0];
         $rf_rd_en2 = $rs2_valid;
         ?$rf_rd_en2
            $rf_rd_index2[4:0] = $rs2[4:0];
         $src1_value[31:0] = ((>>1$rd == $rs1) && (>>1$rf_wr_en ==1'b1)) ? >>1$result : $rf_rd_data1[31:0];
         $src2_value[31:0] = ((>>1$rd == $rs2) && (>>1$rf_wr_en ==1'b1)) ? >>1$result : $rf_rd_data2[31:0];
      @3
         //ALU
         $sltu_result = $src1_value < $src2_value ;
         $sltiu_result = $src1_value < $imm ;
         $result[31:0] = $is_addi ? $src1_value + $imm :
                         $is_add ? $src1_value + $src2_value :
                         $is_or ? $src1_value | $src2_value :
                         $is_ori ? $src1_value | $imm :
                         $is_xor ? $src1_value ^ $src2_value :
                         $is_xori ? $src1_value ^ $imm :
                         $is_and ? $src1_value & $src2_value :
                         $is_andi ? $src1_value & $imm :
                         $is_sub ? $src1_value - $src2_value :
                         $is_slti ? (($src1_value[31] == $imm[31]) ? $sltiu_result : {31'b0,$src1_value[31]}) :
                         $is_sltiu ? $sltiu_result :
                         $is_slli ? $src1_value << $imm[5:0] :
                         $is_srli ? $src1_value >> $imm[5:0] :
                         $is_srai ? ({{32{$src1_value[31]}}, $src1_value} >> $imm[4:0]) :
                         $is_sll ? $src1_value << $src2_value[4:0] :
                         $is_slt ? (($src1_value[31] == $src2_value[31]) ? $sltu_result : {31'b0,$src1_value[31]}) :
                         $is_sltu ? $sltu_result :
                         $is_srl ? $src1_value >> $src2_value[5:0] :
                         $is_sra ? ({{32{$src1_value[31]}}, $src1_value} >> $src2_value[4:0]) :
                         $lui ? ({$imm[31:12], 12'b0}) :
                         $auipc ? $pc + $imm :
                         $jal ? $pc + 4 :
                         $jalr ? $pc + 4 :
                         32'bx;
         // Register File Write
         $rf_wr_en = $valid ? ($rd == 5'b0) ? 1'b0 : $rd_valid : 1'b0;
         ?$rf_wr_en
            $rf_wr_index[4:0] = $rd[4:0];
            $rf_wr_data[31:0] = $result[31:0];
         //Branch Instructions
         $taken_br = $is_beq ? ($src1_value == $src2_value) :
                     $is_bne ? ($src1_value != $src2_value) :
                     $is_blt ? (($src1_value < $src2_value) ^ ($src1_value[31] != $src2_value[31])) :
                     $is_bge ? (($src1_value >= $src2_value) ^ ($src1_value[31] != $src2_value[31])) :
                     $is_bltu ? ($src1_value < $src2_value) :
                     $is_bgeu ? ($src1_value >= $src2_value) :
                     1'b0;
         $valid_taken_br = $valid && $taken_br;
         $valid = !(>>1$valid_taken_br || >>2$valid_taken_br);
         //`BOGUS_USE($is_beq $is_bne $is_blt $is_bge $is_bltu $is_bgeu)
   // Note: Because of the magic we are using for visualisation, if visualisation is enabled below,
   //       be sure to avoid having unassigned signals (which you might be using for random inputs)
   //       other than those specifically expected in the labs. You'll get strange errors for these.
   // Assert these to end simulation (before Makerchip cycle limit).
   //*passed = *cyc_cnt > 40;
   *passed = |cpu/xreg[10]>>5$value == (1+2+3+4+5+6+7+8+9);
   *failed = 1'b0;
   // Macro instantiations for:
   //  o instruction memory
   //  o register file
   //  o data memory
   //  o CPU visualization
   |cpu
      m4+imem(@1)    // Args: (read stage)
      m4+rf(@2, @3)  // Args: (read stage, write stage) - if equal, no register bypass is required
      //m4+dmem(@4)    // Args: (read/write stage)
      m4+cpu_viz(@4)    // For visualisation, argument should be at least equal to the last stage of CPU logic. @4 would work for all labs.
\SV
   endmodule
Screenshot of the pipelined CPU:
It took 56 cycles to execute the sum 1 to 9 program.
\m4_TLV_version 1d: tl-x.org
\SV
   // This code can be found in: https://github.com/stevehoover/RISC-V_MYTH_Workshop
   m4_include_lib(['https://raw.githubusercontent.com/BalaDhinesh/RISC-V_MYTH_Workshop/master/tlv_lib/risc-v_shell_lib.tlv'])
\SV
   m4_makerchip_module   // (Expanded in Nav-TLV pane.)
\TLV
   // /====================\
   // | Sum 1 to 9 Program |
   // \====================/
   //
   // Program for MYTH Workshop to test RV32I
   // Add 1,2,3,...,9 (in that order).
   //
   // Regs:
   //  r10 (a0): In: 0, Out: final sum
   //  r12 (a2): 10
   //  r13 (a3): 1..10
   //  r14 (a4): Sum
   //
   // External to function:
   m4_asm(ADD, r10, r0, r0)             // Initialize r10 (a0) to 0.
   // Function:
   m4_asm(ADD, r14, r10, r0)            // Initialize sum register a4 with 0x0
   m4_asm(ADDI, r12, r10, 1010)         // Store count of 10 in register a2.
   m4_asm(ADD, r13, r10, r0)            // Initialize intermediate sum register a3 with 0
   // Loop:
   m4_asm(ADD, r14, r13, r14)           // Incremental addition
   m4_asm(ADDI, r13, r13, 1)            // Increment intermediate register by 1
   m4_asm(BLT, r13, r12, 1111111111000) // If a3 is less than a2, branch to label named <loop>
   m4_asm(ADD, r10, r14, r0)            // Store final result to register a0 so that it can be read by main program
   m4_asm(SW, r0, r10, 100)
   m4_asm(LW, r15, r0, 100)
   // Optional:
   // m4_asm(JAL, r7, 00000000000000000000) // Done. Jump to itself (infinite loop). (Up to 20-bit signed immediate plus implicit 0 bit (unlike JALR) provides byte address; last immediate bit should also be 0)
   m4_define_hier(['M4_IMEM'], M4_NUM_INSTRS)
   |cpu
      @0
         $reset = *reset;
         $clk_nitheesh = *clk;
         $start = >>1$reset ? !$reset ? '1 :'0 :'0;
         //$valid = $reset ? '0 : $start ? '1 : >>3$valid ? '1 : '0;
         $pc[31:0] = (>>1$reset) ? '0 :
                     (>>3$valid_taken_br) ? >>3$br_tgt_pc :
                     (>>3$is_load) ? >>3$inc_pc :
                     >>1$inc_pc;
         $imem_rd_en = !$reset;
         $imem_rd_addr[31:0] = $pc[M4_IMEM_INDEX_CNT+1:2];
      @1
         $inc_pc[31:0] = $pc[31:0] + 32'd4;
         $instr[31:0] = $imem_rd_data[31:0];
         $is_i_instr = $instr[6:2] ==? 5'b0000x || $instr[6:2] ==? 5'b001x0 || $instr[6:2] == 5'b11001;
         $is_r_instr = $instr[6:2] == 5'b01011 || $instr[6:2] ==? 5'b011x0 || $instr[6:2] == 5'b10100;
         $is_s_instr = $instr[6:2] ==? 5'b0100x;
         $is_b_instr = $instr[6:2] == 5'b11000;
         $is_j_instr = $instr[6:2] == 5'b11011;
         $is_u_instr = $instr[6:2] ==? 5'b0x101;
         $imm[31:0] = $is_i_instr ? {{21{$instr[31]}}, $instr[30:20]} :
                      $is_s_instr ? {{21{$instr[31]}}, $instr[30:25], $instr[11:7]} :
                      $is_b_instr ? {{20{$instr[31]}}, $instr[7], $instr[30:25], $instr[11:8], 1'b0}:
                      $is_u_instr ? {$instr[31:12], 12'b0} :
                      $is_j_instr ? {{12{$instr[31]}}, $instr[19:12], $instr[20], $instr[30:21], 1'b0} :
                      32'b0;
         $rs2_valid = $is_r_instr || $is_s_instr || $is_b_instr;
         $rs1_valid = $is_r_instr || $is_s_instr || $is_b_instr || $is_i_instr;
         $rd_valid = $is_r_instr || $is_i_instr || $is_u_instr || $is_j_instr;
         $funct3_valid = $is_r_instr || $is_s_instr || $is_b_instr || $is_i_instr;
         $funct7_valid = $is_r_instr;
         ?$rs2_valid
            $rs2[4:0] = $instr[24:20];
         ?$rs1_valid
            $rs1[4:0] = $instr[19:15];
         ?$rd_valid
            $rd[4:0] = $instr[11:7];
         ?$funct3_valid
            $funct3[2:0] = $instr[14:12];
         ?$funct7_valid
            $funct7[6:0] = $instr[31:25];
         $opcode[6:0] = $instr[6:0];
         $dec_bits[10:0] = {$funct7[5],$funct3,$opcode};
         // Branch Instruction
         $is_beq = $dec_bits ==? 11'bx_000_1100011;
         $is_bne = $dec_bits ==? 11'bx_001_1100011;
         $is_blt = $dec_bits ==? 11'bx_100_1100011;
         $is_bge = $dec_bits ==? 11'bx_101_1100011;
         $is_bltu = $dec_bits ==? 11'bx_110_1100011;
         $is_bgeu = $dec_bits ==? 11'bx_111_1100011;
         // Arithmetic Instruction
         $is_add = $dec_bits ==? 11'b0_000_0110011;
         $is_addi = $dec_bits ==? 11'bx_000_0010011;
         $is_or = $dec_bits ==? 11'b0_110_0110011;
         $is_ori = $dec_bits ==? 11'bx_110_0010011;
         $is_xor = $dec_bits ==? 11'b0_100_0110011;
         $is_xori = $dec_bits ==? 11'bx_100_0010011;
         $is_and = $dec_bits ==? 11'b0_111_0110011;
         $is_andi = $dec_bits ==? 11'bx_111_0010011;
         $is_sub = $dec_bits ==? 11'b1_000_0110011;
         $is_slti = $dec_bits ==? 11'bx_010_0010011;
         $is_sltiu = $dec_bits ==? 11'bx_011_0010011;
         $is_slli = $dec_bits ==? 11'b0_001_0010011;
         $is_srli = $dec_bits ==? 11'b0_101_0010011;
         $is_srai = $dec_bits ==? 11'b1_101_0010011;
         $is_sll = $dec_bits ==? 11'b0_001_0110011;
         $is_slt = $dec_bits ==? 11'b0_010_0110011;
         $is_sltu = $dec_bits ==? 11'b0_011_0110011;
         $is_srl = $dec_bits ==? 11'b0_101_0110011;
         $is_sra = $dec_bits ==? 11'b1_101_0110011;
         // Load Instruction
         $is_load = $dec_bits ==? 11'bx_xxx_0000011;
         // Store Instruction
         $is_sb = $dec_bits ==? 11'bx_000_0100011;
         $is_sh = $dec_bits ==? 11'bx_001_0100011;
         $is_sw = $dec_bits ==? 11'bx_010_0100011;
         // Jump Instruction
         $lui = $dec_bits ==? 11'bx_xxx_0110111;
         $auipc = $dec_bits ==? 11'bx_xxx_0010111;
         $jal = $dec_bits ==? 11'bx_xxx_1101111;
         $jalr = $dec_bits ==? 11'bx_000_1100111;
      @2
         $br_tgt_pc[31:0] = $pc + $imm;
         // Register File Read Logic
         $rf_rd_en1 = $rs1_valid;
         ?$rf_rd_en1
            $rf_rd_index1[4:0] = $rs1[4:0];
         $rf_rd_en2 = $rs2_valid;
         ?$rf_rd_en2
            $rf_rd_index2[4:0] = $rs2[4:0];
         $src1_value[31:0] = ((>>1$rd == $rs1) && (>>1$rf_wr_en ==1'b1)) ? >>1$result : $rf_rd_data1[31:0];
         $src2_value[31:0] = ((>>1$rd == $rs2) && (>>1$rf_wr_en ==1'b1)) ? >>1$result : $rf_rd_data2[31:0];
      @3
         //ALU
         $sltu_result = $src1_value < $src2_value ;
         $sltiu_result = $src1_value < $imm ;
         $result[31:0] = $is_addi ? $src1_value + $imm :
                         $is_add ? $src1_value + $src2_value :
                         $is_or ? $src1_value | $src2_value :
                         $is_ori ? $src1_value | $imm :
                         $is_xor ? $src1_value ^ $src2_value :
                         $is_xori ? $src1_value ^ $imm :
                         $is_and ? $src1_value & $src2_value :
                         $is_andi ? $src1_value & $imm :
                         $is_sub ? $src1_value - $src2_value :
                         $is_slti ? (($src1_value[31] == $imm[31]) ? $sltiu_result : {31'b0,$src1_value[31]}) :
                         $is_sltiu ? $sltiu_result :
                         $is_slli ? $src1_value << $imm[5:0] :
                         $is_srli ? $src1_value >> $imm[5:0] :
                         $is_srai ? ({{32{$src1_value[31]}}, $src1_value} >> $imm[4:0]) :
                         $is_sll ? $src1_value << $src2_value[4:0] :
                         $is_slt ? (($src1_value[31] == $src2_value[31]) ? $sltu_result : {31'b0,$src1_value[31]}) :
                         $is_sltu ? $sltu_result :
                         $is_srl ? $src1_value >> $src2_value[5:0] :
                         $is_sra ? ({{32{$src1_value[31]}}, $src1_value} >> $src2_value[4:0]) :
                         $lui ? ({$imm[31:12], 12'b0}) :
                         $auipc ? $pc + $imm :
                         $jal ? $pc + 4 :
                         $jalr ? $pc + 4 :
                         ($is_load || $is_s_instr) ? $src1_value + $imm :
                         32'bx;
         // Register File Write
         $rf_wr_en = $valid ? ($rd == 5'b0) ? 1'b0 : $rd_valid : >>2$ld_data;
         ?$rf_wr_en
            $rf_wr_index[4:0] = !$valid ? >>2$rd[4:0] : $rd[4:0];
            $rf_wr_data[31:0] = !$valid ? >>2$ld_data[31:0] : $result[31:0];
         //Branch Instructions
         $taken_br = $is_beq ? ($src1_value == $src2_value) :
                     $is_bne ? ($src1_value != $src2_value) :
                     $is_blt ? (($src1_value < $src2_value) ^ ($src1_value[31] != $src2_value[31])) :
                     $is_bge ? (($src1_value >= $src2_value) ^ ($src1_value[31] != $src2_value[31])) :
                     $is_bltu ? ($src1_value < $src2_value) :
                     $is_bgeu ? ($src1_value >= $src2_value) :
                     1'b0;
         $valid_taken_br = $valid && $taken_br;
         // Load
         $valid_load = $valid && $is_load;
         $valid = !(>>1$valid_taken_br || >>2$valid_taken_br || >>1$valid_load || >>2$valid_load);
      @4
         $dmem_rd_en = $valid_load;
         $dmem_wr_en = $valid && $is_s_instr;
         $dmem_addr[3:0] = $result[5:2];
         $dmem_wr_data[31:0] = $src2_value[31:0];
      @5
         $ld_data[31:0] = $dmem_rd_data[31:0];
         //`BOGUS_USE($is_beq $is_bne $is_blt $is_bge $is_bltu $is_bgeu)
   // Note: Because of the magic we are using for visualisation, if visualisation is enabled below,
   //       be sure to avoid having unassigned signals (which you might be using for random inputs)
   //       other than those specifically expected in the labs. You'll get strange errors for these.
   // Assert these to end simulation (before Makerchip cycle limit).
   //*passed = *cyc_cnt > 40;
   *passed = |cpu/xreg[15]>>5$value == (1+2+3+4+5+6+7+8+9);
   *failed = 1'b0;
   // Macro instantiations for:
   //  o instruction memory
   //  o register file
   //  o data memory
   //  o CPU visualization
   |cpu
      m4+imem(@1)    // Args: (read stage)
      m4+rf(@2, @3)  // Args: (read stage, write stage) - if equal, no register bypass is required
      m4+dmem(@4)    // Args: (read/write stage)
      m4+cpu_viz(@4)    // For visualisation, argument should be at least equal to the last stage of CPU logic. @4 would work for all labs.
\SV
   endmodule
Screenshot of the load and store operations
\m4_TLV_version 1d: tl-x.org
\SV
// This code can be found in: https://github.com/stevehoover/RISC-V_MYTH_Workshop
m4_include_lib(['https://raw.githubusercontent.com/BalaDhinesh/RISC-V_MYTH_Workshop/master/tlv_lib/risc-v_shell_lib.tlv'])
\SV
m4_makerchip_module // (Expanded in Nav-TLV pane.)
\TLV
// /====================\
// | Sum 1 to 9 Program |
// \====================/
//
// Program for MYTH Workshop to test RV32I
// Add 1,2,3,...,9 (in that order).
//
// Regs:
// r10 (a0): In: 0, Out: final sum
// r12 (a2): 10
// r13 (a3): 1..10
// r14 (a4): Sum
//
// External to function:
m4_asm(ADD, r10, r0, r0) // Initialize r10 (a0) to 0.
// Function:
m4_asm(ADD, r14, r10, r0) // Initialize sum register a4 with 0x0
m4_asm(ADDI, r12, r10, 1010) // Store count of 10 in register a2.
m4_asm(ADD, r13, r10, r0) // Initialize intermediate sum register a3 with 0
// Loop:
m4_asm(ADD, r14, r13, r14) // Incremental addition
m4_asm(ADDI, r13, r13, 1) // Increment intermediate register by 1
m4_asm(BLT, r13, r12, 1111111111000) // If a3 is less than a2, branch to label named <loop>
m4_asm(ADD, r10, r14, r0) // Store final result to register a0 so that it can be read by main program
m4_asm(SW, r0, r10, 100)
m4_asm(LW, r15, r0, 100)
// Optional:
// m4_asm(JAL, r7, 00000000000000000000) // Done. Jump to itself (infinite loop). (Up to 20-bit signed immediate plus implicit 0 bit (unlike JALR) provides byte address; last immediate bit should also be 0)
m4_define_hier(['M4_IMEM'], M4_NUM_INSTRS)
|cpu
@0
$reset = *reset;
$clk_nitheesh = *clk;
$start = >>1$reset ? !$reset ? '1 :'0 :'0;
//$valid = $reset ? '0 : $start ? '1 : >>3$valid ? '1 : '0;
$pc[31:0] = (>>1$reset) ? '0 :
(>>3$valid_taken_br) ? >>3$br_tgt_pc :
(>>3$is_load) ? >>3$inc_pc :
(>>3$valid_jump && >>3$is_jal) ? >>3$br_tgt_pc :
(>>3$valid_jump && >>3$is_jalr) ? >>3$jalr_tgt_pc : >>1$inc_pc;
$imem_rd_en = !$reset;
$imem_rd_addr[31:0] = $pc[M4_IMEM_INDEX_CNT+1:2];
@1
$inc_pc[31:0] = $pc[31:0] + 32'd4;
$instr[31:0] = $imem_rd_data[31:0];
$is_i_instr = $instr[6:2] ==? 5'b0000x ||
$instr[6:2] ==? 5'b001x0 ||
$instr[6:2] == 5'b11001;
$is_r_instr = $instr[6:2] == 5'b01011 ||
$instr[6:2] ==? 5'b011x0 ||
$instr[6:2] == 5'b10100;
$is_s_instr = $instr[6:2] ==? 5'b0100x;
$is_b_instr = $instr[6:2] == 5'b11000;
$is_j_instr = $instr[6:2] == 5'b11011;
$is_u_instr = $instr[6:2] ==? 5'b0x101;
$imm[31:0] = $is_i_instr ? {{21{$instr[31]}}, $instr[30:20]} :
$is_s_instr ? {{21{$instr[31]}}, $instr[30:25], $instr[11:7]} :
$is_b_instr ? {{20{$instr[31]}}, $instr[7], $instr[30:25], $instr[11:8], 1'b0}:
$is_u_instr ? {$instr[31:12], 12'b0} :
$is_j_instr ? {{12{$instr[31]}}, $instr[19:12], $instr[20], $instr[30:21], 1'b0} :
32'b0;
$rs2_valid = $is_r_instr || $is_s_instr || $is_b_instr;
$rs1_valid = $is_r_instr || $is_s_instr || $is_b_instr || $is_i_instr;
$rd_valid = $is_r_instr || $is_i_instr || $is_u_instr || $is_j_instr;
$funct3_valid = $is_r_instr || $is_s_instr || $is_b_instr || $is_i_instr;
$funct7_valid = $is_r_instr;
?$rs2_valid
$rs2[4:0] = $instr[24:20];
?$rs1_valid
$rs1[4:0] = $instr[19:15];
?$rd_valid
$rd[4:0] = $instr[11:7];
?$funct3_valid
$funct3[2:0] = $instr[14:12];
?$funct7_valid
$funct7[6:0] = $instr[31:25];
$opcode[6:0] = $instr[6:0];
$dec_bits[10:0] = {$funct7[5],$funct3,$opcode};
// Branch Instruction
$is_beq = $dec_bits ==? 11'bx_000_1100011;
$is_bne = $dec_bits ==? 11'bx_001_1100011;
$is_blt = $dec_bits ==? 11'bx_100_1100011;
$is_bge = $dec_bits ==? 11'bx_101_1100011;
$is_bltu = $dec_bits ==? 11'bx_110_1100011;
$is_bgeu = $dec_bits ==? 11'bx_111_1100011;
// Arithmetic Instruction
$is_add = $dec_bits ==? 11'b0_000_0110011;
$is_addi = $dec_bits ==? 11'bx_000_0010011;
$is_or = $dec_bits ==? 11'b0_110_0110011;
$is_ori = $dec_bits ==? 11'bx_110_0010011;
$is_xor = $dec_bits ==? 11'b0_100_0110011;
$is_xori = $dec_bits ==? 11'bx_100_0010011;
$is_and = $dec_bits ==? 11'b0_111_0110011;
$is_andi = $dec_bits ==? 11'bx_111_0010011;
$is_sub = $dec_bits ==? 11'b1_000_0110011;
$is_slti = $dec_bits ==? 11'bx_010_0010011;
$is_sltiu = $dec_bits ==? 11'bx_011_0010011;
$is_slli = $dec_bits ==? 11'b0_001_0010011;
$is_srli = $dec_bits ==? 11'b0_101_0010011;
$is_srai = $dec_bits ==? 11'b1_101_0010011;
$is_sll = $dec_bits ==? 11'b0_001_0110011;
$is_slt = $dec_bits ==? 11'b0_010_0110011;
$is_sltu = $dec_bits ==? 11'b0_011_0110011;
$is_srl = $dec_bits ==? 11'b0_101_0110011;
$is_sra = $dec_bits ==? 11'b1_101_0110011;
// Load Instruction
$is_load = $dec_bits ==? 11'bx_xxx_0000011;
// Store Instruction
$is_sb = $dec_bits ==? 11'bx_000_0100011;
$is_sh = $dec_bits ==? 11'bx_001_0100011;
$is_sw = $dec_bits ==? 11'bx_010_0100011;
// Jump Instruction
$lui = $dec_bits ==? 11'bx_xxx_0110111;
$auipc = $dec_bits ==? 11'bx_xxx_0010111;
$jal = $dec_bits ==? 11'bx_xxx_1101111;
$jalr = $dec_bits ==? 11'bx_000_1100111;
$is_jump = $is_jal || $is_jalr;
@2
// Branch Target PC
$br_tgt_pc[31:0] = $pc + $imm;
// Jump Target PC
$jalr_tgt_pc[31:0] = $src1_value + $imm;
// Register File Read Logic
$rf_rd_en1 = $rs1_valid;
?$rf_rd_en1
$rf_rd_index1[4:0] = $rs1[4:0];
$rf_rd_en2 = $rs2_valid;
?$rf_rd_en2
$rf_rd_index2[4:0] = $rs2[4:0];
$src1_value[31:0] = ((>>1$rd == $rs1) && (>>1$rf_wr_en ==1'b1)) ? >>1$result : $rf_rd_data1[31:0];
$src2_value[31:0] = ((>>1$rd == $rs2) && (>>1$rf_wr_en ==1'b1)) ? >>1$result : $rf_rd_data2[31:0];
@3
//ALU
$sltu_result = $src1_value < $src2_value ;
$sltiu_result = $src1_value < $imm ;
$result[31:0] = $is_addi ? $src1_value + $imm :
$is_add ? $src1_value + $src2_value :
$is_or ? $src1_value | $src2_value :
$is_ori ? $src1_value | $imm :
$is_xor ? $src1_value ^ $src2_value :
$is_xori ? $src1_value ^ $imm :
$is_and ? $src1_value & $src2_value :
$is_andi ? $src1_value & $imm :
$is_sub ? $src1_value - $src2_value :
$is_slti ? (($src1_value[31] == $imm[31]) ? $sltiu_result : {31'b0,$src1_value[31]}) :
$is_sltiu ? $sltiu_result :
$is_slli ? $src1_value << $imm[5:0] :
$is_srli ? $src1_value >> $imm[5:0] :
$is_srai ? ({{32{$src1_value[31]}}, $src1_value} >> $imm[4:0]) :
$is_sll ? $src1_value << $src2_value[4:0] :
$is_slt ? (($src1_value[31] == $src2_value[31]) ? $sltu_result : {31'b0,$src1_value[31]}) :
$is_sltu ? $sltu_result :
$is_srl ? $src1_value >> $src2_value[5:0] :
$is_sra ? ({{32{$src1_value[31]}}, $src1_value} >> $src2_value[4:0]) :
$lui ? ({$imm[31:12], 12'b0}) :
$auipc ? $pc + $imm :
$jal ? $pc + 4 :
$jalr ? $pc + 4 :
($is_load || $is_s_instr) ? $src1_value + $imm : 32'bx;
// Register File Write
$rf_wr_en = $valid ? ($rd == 5'b0) ? 1'b0 : $rd_valid : >>2$ld_data;
?$rf_wr_en
$rf_wr_index[4:0] = !$valid ? >>2$rd[4:0] : $rd[4:0];
$rf_wr_data[31:0] = !$valid ? >>2$ld_data[31:0] : $result[31:0];
//Branch Instructions
$taken_br = $is_beq ? ($src1_value == $src2_value) :
$is_bne ? ($src1_value != $src2_value) :
$is_blt ? (($src1_value < $src2_value) ^ ($src1_value[31] != $src2_value[31])) :
$is_bge ? (($src1_value >= $src2_value) ^ ($src1_value[31] != $src2_value[31])) :
$is_bltu ? ($src1_value < $src2_value) :
$is_bgeu ? ($src1_value >= $src2_value) : 1'b0;
$valid_taken_br = $valid && $taken_br;
// Load
$valid_load = $valid && $is_load;
$valid = !(>>1$valid_taken_br || >>2$valid_taken_br || >>1$valid_load || >>2$valid_load || >>1$valid_jump || >>2$valid_jump);
//Jump
$valid_jump = $valid && $is_jump;
@4
$dmem_rd_en = $valid_load;
$dmem_wr_en = $valid && $is_s_instr;
$dmem_addr[3:0] = $result[5:2];
$dmem_wr_data[31:0] = $src2_value[31:0];
@5
$ld_data[31:0] = $dmem_rd_data[31:0];
//`BOGUS_USE($is_beq $is_bne $is_blt $is_bge $is_bltu $is_bgeu)
// Note: Because of the magic we are using for visualisation, if visualisation is enabled below,
// be sure to avoid having unassigned signals (which you might be using for random inputs)
// other than those specifically expected in the labs. You'll get strange errors for these.
// Assert these to end simulation (before Makerchip cycle limit).
//*passed = *cyc_cnt > 40;
*passed = |cpu/xreg[15]>>5$value == (1+2+3+4+5+6+7+8+9);
*failed = 1'b0;
// Macro instantiations for:
// o instruction memory
// o register file
// o data memory
// o CPU visualization
|cpu
m4+imem(@1) // Args: (read stage)
m4+rf(@2, @3) // Args: (read stage, write stage) - if equal, no register bypass is required
m4+dmem(@4) // Args: (read/write stage)
m4+cpu_viz(@4) // For visualisation, argument should be at least equal to the last stage of CPU logic. @4 would work for all labs.
\SV
endmodule
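The loop branch in the listing above is written as m4_asm(BLT, r13, r12, 1111111111000). As a rough illustration only (it is not part of the workshop code), the Python sketch below reads that binary string as a two's-complement byte offset. The helper name signed_from_bits is hypothetical and chosen just for this example.

```python
def signed_from_bits(bits: str) -> int:
    """Interpret a binary string as a two's-complement signed integer."""
    value = int(bits, 2)
    if bits[0] == "1":  # sign bit set: subtract 2**width
        value -= 1 << len(bits)
    return value


# Branch immediate used by the BLT instruction in the listing.
BLT_OFFSET = signed_from_bits("1111111111000")

# Each RV32I instruction occupies 4 bytes, so this is the distance in instructions.
INSTRUCTIONS_BACK = -BLT_OFFSET // 4

if __name__ == "__main__":
    print(BLT_OFFSET, INSTRUCTIONS_BACK)
```

The offset works out to -8 bytes, which is two instructions back from the BLT, so the branch lands on the ADD r14, r13, r14 at the start of the loop.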
Final Diagram
Clock signal
Reset Signal
Hence, the waveform below shows that register r14 (a4) holds the sum of the numbers from 1 to 9, which is 45.
Completed in 58 cycles.
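To cross-check the value expected in the waveform, the assembly program can be mimicked in plain Python. This is only a behavioural sketch under our own assumptions: the function name run_sum_program and the dictionary of registers are made up for illustration, the pipeline timing (the 58 cycles) is not modelled, and the 16-word memory size is inferred from the 4-bit $dmem_addr in the TL-Verilog code.

```python
def run_sum_program(limit: int = 10) -> dict:
    """Mimic the sum loop from the m4_asm program with plain Python variables."""
    regs = {"a0": 0, "a2": 0, "a3": 0, "a4": 0, "a5": 0}
    dmem = [0] * 16  # assumption: 16 words, inferred from $dmem_addr[3:0]

    regs["a4"] = regs["a0"]              # ADD  r14, r10, r0
    regs["a2"] = regs["a0"] + limit      # ADDI r12, r10, 1010
    regs["a3"] = regs["a0"]              # ADD  r13, r10, r0
    while True:
        regs["a4"] += regs["a3"]         # ADD  r14, r13, r14
        regs["a3"] += 1                  # ADDI r13, r13, 1
        if not regs["a3"] < regs["a2"]:  # BLT  r13, r12, loop
            break
    regs["a0"] = regs["a4"]              # ADD  r10, r14, r0
    dmem[0b100 >> 2] = regs["a0"]        # SW   r0, r10, 100 (byte address 4, word 1)
    regs["a5"] = dmem[0b100 >> 2]        # LW   r15, r0, 100
    return regs


if __name__ == "__main__":
    print(run_sum_program())
```

With the default limit of 10, a4 (r14) and a5 (r15) both end at 45, which is the value the *passed check compares against xreg[15].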
Steps to be followed:
sudo apt install python3-pip git iverilog gtkwave
cd ~
sudo apt-get install python3-venv
python3 -m venv .venv
source ~/.venv/bin/activate
pip3 install pyyaml click sandpiper-saas
$ sudo apt install make python python3 python3-pip git iverilog gtkwave docker.io
$ sudo chmod 666 /var/run/docker.sock
$ cd ~
$ pip3 install pyyaml click sandpiper-saas
git clone https://github.com/manili/VSDBabySoC.git
cd VSDBabySoC
cd /home/vsduser/VSDBabySoC
make pre_synth_sim
Replace the rvmyth.tlv file in the VSDBabySoC/src/module folder with the RISC-V TL-Verilog code from Makerchip. To convert the TLV to Verilog, we will use the code we built in the previous lab:
\m4_TLV_version 1d: tl-x.org
\SV
// This code can be found in: https://github.com/stevehoover/RISC-V_MYTH_Workshop
m4_include_lib(['https://raw.githubusercontent.com/BalaDhinesh/RISC-V_MYTH_Workshop/master/tlv_lib/risc-v_shell_lib.tlv'])
\SV
m4_makerchip_module // (Expanded in Nav-TLV pane.)
\TLV
// /====================\
// | Sum 1 to 9 Program |
// \====================/
//
// Program for MYTH Workshop to test RV32I
// Add 1,2,3,...,9 (in that order).
//
// Regs:
// r10 (a0): In: 0, Out: final sum
// r12 (a2): 10
// r13 (a3): 1..10
// r14 (a4): Sum
//
// External to function:
m4_asm(ADD, r10, r0, r0) // Initialize r10 (a0) to 0.
// Function:
m4_asm(ADD, r14, r10, r0) // Initialize sum register a4 with 0x0
m4_asm(ADDI, r12, r10, 1010) // Store count of 10 in register a2.
m4_asm(ADD, r13, r10, r0) // Initialize intermediate sum register a3 with 0
// Loop:
m4_asm(ADD, r14, r13, r14) // Incremental addition
m4_asm(ADDI, r13, r13, 1) // Increment intermediate register by 1
m4_asm(BLT, r13, r12, 1111111111000) // If a3 is less than a2, branch to label named <loop>
m4_asm(ADD, r10, r14, r0) // Store final result to register a0 so that it can be read by main program
m4_asm(SW, r0, r10, 100)
m4_asm(LW, r15, r0, 100)
// Optional:
// m4_asm(JAL, r7, 00000000000000000000) // Done. Jump to itself (infinite loop). (Up to 20-bit signed immediate plus implicit 0 bit (unlike JALR) provides byte address; last immediate bit should also be 0)
m4_define_hier(['M4_IMEM'], M4_NUM_INSTRS)
|cpu
@0
$reset = *reset;
$clk_nitheesh = *clk;
$start = >>1$reset ? !$reset ? '1 :'0 :'0;
//$valid = $reset ? '0 : $start ? '1 : >>3$valid ? '1 : '0;
$pc[31:0] = (>>1$reset) ? '0 :
(>>3$valid_taken_br) ? >>3$br_tgt_pc :
(>>3$is_load) ? >>3$inc_pc :
(>>3$valid_jump && >>3$is_jal) ? >>3$br_tgt_pc :
(>>3$valid_jump && >>3$is_jalr) ? >>3$jalr_tgt_pc : >>1$inc_pc;
$imem_rd_en = !$reset;
$imem_rd_addr[31:0] = $pc[M4_IMEM_INDEX_CNT+1:2];
@1
$inc_pc[31:0] = $pc[31:0] + 32'd4;
$instr[31:0] = $imem_rd_data[31:0];
$is_i_instr = $instr[6:2] ==? 5'b0000x ||
$instr[6:2] ==? 5'b001x0 ||
$instr[6:2] == 5'b11001;
$is_r_instr = $instr[6:2] == 5'b01011 ||
$instr[6:2] ==? 5'b011x0 ||
$instr[6:2] == 5'b10100;
$is_s_instr = $instr[6:2] ==? 5'b0100x;
$is_b_instr = $instr[6:2] == 5'b11000;
$is_j_instr = $instr[6:2] == 5'b11011;
$is_u_instr = $instr[6:2] ==? 5'b0x101;
$imm[31:0] = $is_i_instr ? {{21{$instr[31]}}, $instr[30:20]} :
$is_s_instr ? {{21{$instr[31]}}, $instr[30:25], $instr[11:7]} :
$is_b_instr ? {{20{$instr[31]}}, $instr[7], $instr[30:25], $instr[11:8], 1'b0}:
$is_u_instr ? {$instr[31:12], 12'b0} :
$is_j_instr ? {{12{$instr[31]}}, $instr[19:12], $instr[20], $instr[30:21], 1'b0} :
32'b0;
$rs2_valid = $is_r_instr || $is_s_instr || $is_b_instr;
$rs1_valid = $is_r_instr || $is_s_instr || $is_b_instr || $is_i_instr;
$rd_valid = $is_r_instr || $is_i_instr || $is_u_instr || $is_j_instr;
$funct3_valid = $is_r_instr || $is_s_instr || $is_b_instr || $is_i_instr;
$funct7_valid = $is_r_instr;
?$rs2_valid
$rs2[4:0] = $instr[24:20];
?$rs1_valid
$rs1[4:0] = $instr[19:15];
?$rd_valid
$rd[4:0] = $instr[11:7];
?$funct3_valid
$funct3[2:0] = $instr[14:12];
?$funct7_valid
$funct7[6:0] = $instr[31:25];
$opcode[6:0] = $instr[6:0];
$dec_bits[10:0] = {$funct7[5],$funct3,$opcode};
// Branch Instruction
$is_beq = $dec_bits ==? 11'bx_000_1100011;
$is_bne = $dec_bits ==? 11'bx_001_1100011;
$is_blt = $dec_bits ==? 11'bx_100_1100011;
$is_bge = $dec_bits ==? 11'bx_101_1100011;
$is_bltu = $dec_bits ==? 11'bx_110_1100011;
$is_bgeu = $dec_bits ==? 11'bx_111_1100011;
// Arithmetic Instruction
$is_add = $dec_bits ==? 11'b0_000_0110011;
$is_addi = $dec_bits ==? 11'bx_000_0010011;
$is_or = $dec_bits ==? 11'b0_110_0110011;
$is_ori = $dec_bits ==? 11'bx_110_0010011;
$is_xor = $dec_bits ==? 11'b0_100_0110011;
$is_xori = $dec_bits ==? 11'bx_100_0010011;
$is_and = $dec_bits ==? 11'b0_111_0110011;
$is_andi = $dec_bits ==? 11'bx_111_0010011;
$is_sub = $dec_bits ==? 11'b1_000_0110011;
$is_slti = $dec_bits ==? 11'bx_010_0010011;
$is_sltiu = $dec_bits ==? 11'bx_011_0010011;
$is_slli = $dec_bits ==? 11'b0_001_0010011;
$is_srli = $dec_bits ==? 11'b0_101_0010011;
$is_srai = $dec_bits ==? 11'b1_101_0010011;
$is_sll = $dec_bits ==? 11'b0_001_0110011;
$is_slt = $dec_bits ==? 11'b0_010_0110011;
$is_sltu = $dec_bits ==? 11'b0_011_0110011;
$is_srl = $dec_bits ==? 11'b0_101_0110011;
$is_sra = $dec_bits ==? 11'b1_101_0110011;
// Load Instruction
$is_load = $dec_bits ==? 11'bx_xxx_0000011;
// Store Instruction
$is_sb = $dec_bits ==? 11'bx_000_0100011;
$is_sh = $dec_bits ==? 11'bx_001_0100011;
$is_sw = $dec_bits ==? 11'bx_010_0100011;
// Jump Instruction
$lui = $dec_bits ==? 11'bx_xxx_0110111;
$auipc = $dec_bits ==? 11'bx_xxx_0010111;
$jal = $dec_bits ==? 11'bx_xxx_1101111;
$jalr = $dec_bits ==? 11'bx_000_1100111;
$is_jump = $is_jal || $is_jalr;
@2
// Branch Target PC
$br_tgt_pc[31:0] = $pc + $imm;
// Jump Target PC
$jalr_tgt_pc[31:0] = $src1_value + $imm;
// Register File Read Logic
$rf_rd_en1 = $rs1_valid;
?$rf_rd_en1
$rf_rd_index1[4:0] = $rs1[4:0];
$rf_rd_en2 = $rs2_valid;
?$rf_rd_en2
$rf_rd_index2[4:0] = $rs2[4:0];
$src1_value[31:0] = ((>>1$rd == $rs1) && (>>1$rf_wr_en ==1'b1)) ? >>1$result : $rf_rd_data1[31:0];
$src2_value[31:0] = ((>>1$rd == $rs2) && (>>1$rf_wr_en ==1'b1)) ? >>1$result : $rf_rd_data2[31:0];
@3
//ALU
$sltu_result = $src1_value < $src2_value ;
$sltiu_result = $src1_value < $imm ;
$result[31:0] = $is_addi ? $src1_value + $imm :
$is_add ? $src1_value + $src2_value :
$is_or ? $src1_value | $src2_value :
$is_ori ? $src1_value | $imm :
$is_xor ? $src1_value ^ $src2_value :
$is_xori ? $src1_value ^ $imm :
$is_and ? $src1_value & $src2_value :
$is_andi ? $src1_value & $imm :
$is_sub ? $src1_value - $src2_value :
$is_slti ? (($src1_value[31] == $imm[31]) ? $sltiu_result : {31'b0,$src1_value[31]}) :
$is_sltiu ? $sltiu_result :
$is_slli ? $src1_value << $imm[5:0] :
$is_srli ? $src1_value >> $imm[5:0] :
$is_srai ? ({{32{$src1_value[31]}}, $src1_value} >> $imm[4:0]) :
$is_sll ? $src1_value << $src2_value[4:0] :
$is_slt ? (($src1_value[31] == $src2_value[31]) ? $sltu_result : {31'b0,$src1_value[31]}) :
$is_sltu ? $sltu_result :
$is_srl ? $src1_value >> $src2_value[5:0] :
$is_sra ? ({{32{$src1_value[31]}}, $src1_value} >> $src2_value[4:0]) :
$lui ? ({$imm[31:12], 12'b0}) :
$auipc ? $pc + $imm :
$jal ? $pc + 4 :
$jalr ? $pc + 4 :
($is_load || $is_s_instr) ? $src1_value + $imm : 32'bx;
// Register File Write
$rf_wr_en = $valid ? ($rd == 5'b0) ? 1'b0 : $rd_valid : >>2$ld_data;
?$rf_wr_en
$rf_wr_index[4:0] = !$valid ? >>2$rd[4:0] : $rd[4:0];
$rf_wr_data[31:0] = !$valid ? >>2$ld_data[31:0] : $result[31:0];
//Branch Instructions
$taken_br = $is_beq ? ($src1_value == $src2_value) :
$is_bne ? ($src1_value != $src2_value) :
$is_blt ? (($src1_value < $src2_value) ^ ($src1_value[31] != $src2_value[31])) :
$is_bge ? (($src1_value >= $src2_value) ^ ($src1_value[31] != $src2_value[31])) :
$is_bltu ? ($src1_value < $src2_value) :
$is_bgeu ? ($src1_value >= $src2_value) : 1'b0;
$valid_taken_br = $valid && $taken_br;
// Load
$valid_load = $valid && $is_load;
$valid = !(>>1$valid_taken_br || >>2$valid_taken_br || >>1$valid_load || >>2$valid_load || >>1$valid_jump || >>2$valid_jump);
//Jump
$valid_jump = $valid && $is_jump;
@4
$dmem_rd_en = $valid_load;
$dmem_wr_en = $valid && $is_s_instr;
$dmem_addr[3:0] = $result[5:2];
$dmem_wr_data[31:0] = $src2_value[31:0];
@5
$ld_data[31:0] = $dmem_rd_data[31:0];
//`BOGUS_USE($is_beq $is_bne $is_blt $is_bge $is_bltu $is_bgeu)
// Note: Because of the magic we are using for visualisation, if visualisation is enabled below,
// be sure to avoid having unassigned signals (which you might be using for random inputs)
// other than those specifically expected in the labs. You'll get strange errors for these.
// Assert these to end simulation (before Makerchip cycle limit).
//*passed = *cyc_cnt > 40;
*passed = |cpu/xreg[15]>>5$value == (1+2+3+4+5+6+7+8+9);
*failed = 1'b0;
// Macro instantiations for:
// o instruction memory
// o register file
// o data memory
// o CPU visualization
|cpu
m4+imem(@1) // Args: (read stage)
m4+rf(@2, @3) // Args: (read stage, write stage) - if equal, no register bypass is required
m4+dmem(@4) // Args: (read/write stage)
m4+cpu_viz(@4) // For visualisation, argument should be at least equal to the last stage of CPU logic. @4 would work for all labs.
\SV
endmodule
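The $taken_br expression in the code above builds the signed BLT comparison from an unsigned compare XORed with a "signs differ" bit. As a small sanity check of that idea, and not as part of the design flow, the Python sketch below applies the same expression to 32-bit values. The names MASK32 and blt_taken are our own.

```python
MASK32 = 0xFFFFFFFF


def blt_taken(src1: int, src2: int) -> bool:
    """Signed less-than built from an unsigned compare, as in $taken_br for BLT."""
    src1 &= MASK32  # view both operands as raw 32-bit patterns
    src2 &= MASK32
    unsigned_lt = src1 < src2
    signs_differ = (src1 >> 31) != (src2 >> 31)
    return unsigned_lt ^ signs_differ


if __name__ == "__main__":
    print(blt_taken(9, 10), blt_taken(10, 10), blt_taken(-1, 1))
```

When the sign bits match, the unsigned result is already correct. When they differ, the unsigned result is inverted, because the operand with the sign bit set looks larger as an unsigned number but is actually the negative one.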
sandpiper-saas -i ./src/module/rvmyth.tlv -o rvmyth.v --bestsv --noline -p verilog --outdir ./src/module/
iverilog -o output/RV_CPU.out src/module/RV_CPU_tb.v -I src/include -I src/module
cd output
./RV_CPU.out
gtkwave RV_CPU_tb.vcd
Here we can see the detailed waveforms of clk, reset, and the 10-bit out signal, which carries the sum of the numbers from 1 to 9.
Waveform of clk
Waveform of reset
10-bit out signal, sum of numbers from 1 to 9
xreg[14] contents
Simulation of the Verilog code in GTKWave:
Hence, comparing the waveforms above, the result of the code simulated from TLV is the same as that of the Verilog code.
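When comparing the two waveforms, it helps to know how the expected sum looks on a 10-bit bus. The Python sketch below is only a convenience for reading the viewer and is not part of the simulation flow. The helper name as_waveform_value is hypothetical.

```python
def as_waveform_value(value: int, width: int = 10) -> tuple:
    """Return (binary, hex) strings for a value on a bus of the given width."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the bus width")
    hex_digits = (width + 3) // 4  # round the width up to whole hex digits
    return format(value, f"0{width}b"), format(value, f"0{hex_digits}X")


if __name__ == "__main__":
    print(as_waveform_value(sum(range(1, 10))))
```

For the sum 45 this gives 0000101101 in binary and 02D in hex, which is what to look for depending on the data format selected in GTKWave.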
Installing Yosys on Linux:
$ git clone https://github.com/YosysHQ/yosys.git
$ cd yosys
$ sudo apt install make (If make is not installed)
$ sudo apt-get install build-essential clang bison flex \
libreadline-dev gawk tcl-dev libffi-dev git \
graphviz xdot pkg-config python3 libboost-system-dev \
libboost-python-dev libboost-filesystem-dev zlib1g-dev
$ make config-gcc
$ make
$ sudo make install
Verifying that yosys is installed
sudo apt-get install iverilog
Screenshot showing that iverilog is installed
sudo apt update
sudo apt install gtkwave
Screenshot showing that gtkwave is installed on Linux
src/module - contains all RTL files and testbench.v used for simulating our BabySoC design
src/include - contains RTL files brought in via `include in the main RTL files in src/module
These files, except RV_CPU.v, have been taken from the repository https://github.com/Subhasis-Sahu/BabySoC_Simulation
iverilog -o output/RV_CPU.out src/module/testbench.v -I src/include -I src/module
./RV_CPU.out
gtkwave dump.out
The output, the sum of the numbers from 1 to 9, can be observed in the waveform.