darklife / darkriscv

Open source RISC-V CPU core implemented in Verilog from scratch in one night!
BSD 3-Clause "New" or "Revised" License

Why set two general-purpose register groups #46

Closed vgegok closed 2 years ago

vgegok commented 2 years ago

Why does darkriscv.v define two general-purpose register groups, instead of the single one described in the RISC-V specification?

Is it to support multi-core?

reg [31:0] REG1 [0:31]; // general-purpose 32x32-bit registers (s1)
reg [31:0] REG2 [0:31]; // general-purpose 32x32-bit registers (s2)

I tried using a single general-purpose register set and it still works.

samsoniuk commented 2 years ago

Oh, good point! It is a long story... The RISC-V logical specification needs to handle 3 different registers at the same time: s1, s2 and d, so the general construction for a sum is:

reg[d] = reg[s1]+reg[s2]

which means that the register set must physically be a triple-port memory. However, from the physical point of view, the register set is built with LUTRAM, which can handle only two accesses at the same time, so the tool does a "hidden" expansion in order to infer a triple-port memory: it uses a replica of the register set, which takes 2x more LUTRAMs.

Well, one solution to avoid the "hidden" expansion is to read each operand in a separate clock:

tmp = reg[s1]
reg[d] = tmp+reg[s2]

but it introduces a problem in the pipeline: in case s1 and d are the same, the operation needs to be interlocked with the previous operation, which reduces performance.

Another solution is to replicate the set:

reg1[d] = reg1[s1]+reg2[s2]
reg2[d] = reg1[s1]+reg2[s2]

which ensures that reg1 and reg2 hold the same values, but reg1 and reg2 are both dual-port LUTRAMs with no "hidden" expansions, so 1 logical register set appears physically as 2 register sets.
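The replica idea can be sketched as a small software model. This is only an illustration in C, not the darkriscv Verilog: the type `regfile_t` and the functions `rf_write`, `rf_add` and `rf_consistent` are hypothetical names. Every write goes to both copies, while each copy serves only one of the two operand reads.

```c
#include <assert.h>
#include <stdint.h>

#define NREGS 32

/* Hypothetical software model: two identical copies of the register set. */
typedef struct {
    uint32_t reg1[NREGS]; /* copy read for operand s1 */
    uint32_t reg2[NREGS]; /* copy read for operand s2 */
} regfile_t;

/* A write goes to both copies, so they always hold the same values. */
void rf_write(regfile_t *rf, unsigned d, uint32_t value)
{
    assert(d < NREGS);
    rf->reg1[d] = value;
    rf->reg2[d] = value;
}

/* reg[d] = reg[s1] + reg[s2]: each copy serves one read and one write. */
void rf_add(regfile_t *rf, unsigned d, unsigned s1, unsigned s2)
{
    assert(s1 < NREGS && s2 < NREGS);
    rf_write(rf, d, rf->reg1[s1] + rf->reg2[s2]);
}

/* Returns 1 when both copies are identical, 0 otherwise. */
int rf_consistent(const regfile_t *rf)
{
    for (unsigned i = 0; i < NREGS; i++)
        if (rf->reg1[i] != rf->reg2[i])
            return 0;
    return 1;
}
```

Because no write ever touches only one copy, `rf_consistent` keeps returning 1, which is the property that lets one logical register set be built from two dual-port memories.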

Well, at this point there is not really much of a problem, since most tools infer this automatically. The problem is that sometimes I found a 3rd hidden replica!

Let's say you have the following logic:

reg[d] = op ? reg[s1]+reg[s2] : reg[d]

which means add the s1+s2 registers when op is set, but do nothing when there is no op... but is this a triple-port memory or a quad-port memory? We have one write to register d and three reads (registers s1, s2 and d). Depending on how the logic is constructed, the tool may infer a 3rd hidden replica:

reg1[d] = op ? reg1[s1]+reg2[s2] : reg3[d]
reg2[d] = op ? reg1[s1]+reg2[s2] : reg3[d] (hidden expansion)
reg3[d] = op ? reg1[s1]+reg2[s2] : reg3[d] (hidden expansion)

By changing the code to:

reg1[d] = op ? reg1[s1]+reg2[s2] : reg1[d]
reg2[d] = op ? reg1[s1]+reg2[s2] : reg2[d]

the tool avoided the 3rd replica.
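A minimal C sketch of the rewritten logic, assuming each copy can read back the word at its own write address (the names `regfile_t` and `rf_step` are hypothetical, and this models behavior only, not what any synthesis tool infers). Each copy is addressed with just two addresses per clock: d plus one source register.

```c
#include <assert.h>
#include <stdint.h>

#define NREGS 32

/* Hypothetical model of the two explicit register set copies. */
typedef struct {
    uint32_t reg1[NREGS];
    uint32_t reg2[NREGS];
} regfile_t;

/* One clock of: regN[d] = op ? reg1[s1] + reg2[s2] : regN[d]
 * reg1 is addressed with d and s1 only, reg2 with d and s2 only,
 * so the value held when op is 0 comes from the same copy. */
void rf_step(regfile_t *rf, int op, unsigned d, unsigned s1, unsigned s2)
{
    assert(d < NREGS && s1 < NREGS && s2 < NREGS);
    uint32_t sum   = rf->reg1[s1] + rf->reg2[s2];
    uint32_t hold1 = rf->reg1[d]; /* read-back from copy 1 */
    uint32_t hold2 = rf->reg2[d]; /* read-back from copy 2 */
    rf->reg1[d] = op ? sum : hold1;
    rf->reg2[d] = op ? sum : hold2;
}
```

With op equal to 0 the destination register keeps its value in both copies, and with op set both copies receive the same sum, so no third source of reg[d] is needed in this model.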

Well, I think it is possible to eliminate the register replica in order to make the code smaller and clearer, but it depends on some additional tests with different tools, in order to ensure that there is no extra "hidden" logic and that the timing is not affected.

I guess the problem started with the Spartan-3 build and, after the register set replica was introduced, I made additional changes in the logic that may help the tool avoid the 3rd hidden replica.

So, I will check how the tool is handling the current code and maybe we can cut some useless lines of code here! :)

vgegok commented 2 years ago

Thank you for your answer. I'm studying your project. I've simplified the darkriscv-v code to a very small 300 lines (or even fewer) to clearly read and understand your design ideas. I have to admit that implementing a fully functional CPU in 300 lines is great.

In addition, the MAC instruction is not defined in the specification or in the compiler, and I don't know why it needs to be handled. Is it meant as a guide for designing custom instructions, or something else?

To make the simulation waveform easier to read, I set the immediate value to x (undefined) for instructions that do not need an immediate extension. I wonder if this will lead to some problems? Like this:

                 IDATA[6:0]==`SCC ? { IDATA[31] ? ALL1[31:12]:ALL0[31:12], IDATA[31:25],IDATA[11:7] } : // s-type                                       20bits {imm[11:5]}{imm[4:0]}
                 IDATA[6:0]==`BCC ? { IDATA[31] ? ALL1[31:13]:ALL0[31:13], IDATA[31],IDATA[7],IDATA[30:25],IDATA[11:8],ALL0[0] } : // b-type            
                 IDATA[6:0]==`JAL ? { IDATA[31] ? ALL1[31:21]:ALL0[31:21], IDATA[31], IDATA[19:12], IDATA[20], IDATA[30:21], ALL0[0] } : // j-type
                 IDATA[6:0]==`LUI||
                 IDATA[6:0]==`AUIPC ? { IDATA[31:12], ALL0[11:0] } : // u-type
                 IDATA[6:0]==`JALR||
                 IDATA[6:0]==`LCC||
                 IDATA[6:0]==`MCC ? { IDATA[31] ? ALL1[31:12]:ALL0[31:12], IDATA[31:20] }: // i-type
                                    32'bx;
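For readers following the immediate decoder above, here is a minimal C sketch of the same bit slicing for the I-, S- and U-type formats. The function names are hypothetical and the code is not part of darkriscv; it only mirrors the field positions used in the Verilog.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of `value` to 32 bits. */
int32_t sign_extend(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits < 32);
    uint32_t sign = (uint32_t)1 << (bits - 1);
    value &= (sign << 1) - 1;            /* keep only the low `bits` bits */
    return (int32_t)((value ^ sign) - sign);
}

/* I-type: imm[11:0] = inst[31:20] */
int32_t imm_i(uint32_t inst)
{
    return sign_extend(inst >> 20, 12);
}

/* S-type: imm[11:5] = inst[31:25], imm[4:0] = inst[11:7] */
int32_t imm_s(uint32_t inst)
{
    uint32_t raw = ((inst >> 25) << 5) | ((inst >> 7) & 0x1Fu);
    return sign_extend(raw, 12);
}

/* U-type: imm[31:12] = inst[31:12], the low 12 bits are zero */
uint32_t imm_u(uint32_t inst)
{
    return inst & 0xFFFFF000u;
}
```

For instance, `imm_i(0xfff50513)` (the encoding of `addi a0,a0,-1`) gives -1 and `imm_s(0x00a12423)` (the encoding of `sw a0,8(sp)`) gives 8.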


Forgive my English

samsoniuk commented 2 years ago

Thank you! I hope the darkriscv will help your research! :)

I have a pending task to make the syntax clearer: one possibility is to parse the ifdefs so that it is possible to produce a Verilog output without ifdefs, which is far clearer. Another possibility is to create separate files with separate cores (2-stage pipeline, 3-stage pipeline, 3-stage w/ threads, etc)... but I am not sure about the best approach.

Regarding the simulation, I think there is no problem, since X values do not really change anything in the physical design (the synthesis probably ignores them). One tip is to try the synthesis with and without the change, to check if anything changes... but the X values are probably just ignored and have no impact on the design.

Anyway, I do not like X values much in the simulation, because they typically appear when something is really undefined or in conflict.

Finally, I use the MAC instruction mainly for DSP applications (16-bit audio processing), but it is not part of any official specification and it is intended to be encoded by hand. The format is the same as other R-type instructions, except that the opcode field is fixed to 7'b1111111, as seen in the mac() function available in stdio.c:

int mac(int acc,short x,short y)
{
#ifdef __RISCV__
    __asm__(".word 0x00c5857F"); // mac a0,a1,a2
    // "template"
    //acc += (x^y);
#else
    acc+=x*y;
#endif
    return acc;
}
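The hand-encoded word can be reproduced with a short C sketch of the standard R-type layout. The helper names `encode_rtype` and `encode_mac` are hypothetical, and the zero funct3 and funct7 fields are read off the 0x00c5857F word above, not taken from any darkriscv documentation.

```c
#include <assert.h>
#include <stdint.h>

#define MAC_OPCODE 0x7Fu /* 7'b1111111, as stated in the thread */

/* Pack the six R-type fields into one 32-bit instruction word. */
uint32_t encode_rtype(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                      uint32_t funct3, uint32_t rd, uint32_t opcode)
{
    assert(funct7 < 128 && rs2 < 32 && rs1 < 32);
    assert(funct3 < 8 && rd < 32 && opcode < 128);
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | opcode;
}

/* Assumption: funct3 = 0 and funct7 = 0, matching the word in mac(). */
uint32_t encode_mac(uint32_t rd, uint32_t rs1, uint32_t rs2)
{
    return encode_rtype(0, rs2, rs1, 0, rd, MAC_OPCODE);
}
```

With a0, a1 and a2 being registers x10, x11 and x12, `encode_mac(10, 11, 12)` yields 0x00c5857F, the same word used in the `.word` directive.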

The compiled code will be something like this, because the tools do not recognize the instruction pattern:

00000b2c <mac>:
 b2c:   857f                    0x857f
 b2e:   00c5                    addi    ra,ra,17
 b30:   00008067            ret

When called this way, the total time for a MAC will be 7 clocks, since the call/return pair takes 2 clocks, plus 4 clocks for the pipeline flushes (the darkriscv prefetches 2 instructions ahead).

It is far faster than a software-implemented multiply, but a better result can be reached by encoding the instruction inline by hand, so that complex hand-coded audio filters can reach 1 MAC/clock.

vgegok commented 2 years ago

I see, it is wonderful!!! I have completed the port to the minispartan6 plus development board and will submit a pull request in the near future. After that, I will have more learning tasks, such as a CoreMark test and adding GPIO, SPI, HDMI, audio and other functions.

I like your project very much, including your code style. While exploring RISC-V, I can also learn a lot about hardware design and computer architecture.

Thank you very much!

vgegok commented 2 years ago

I am waiting for your update to the general-purpose register part of the code, and then I will close this issue.

samsoniuk commented 2 years ago

I made the test here and, although it affects the Artix-7 maximum speed, I liked the results for Kintex-7, so I will change according to your suggestion:

                     manual            automatic
kintex-7        238MHz  1953LUT     244MHz  1806LUT (wow!)
artix-7         189MHz  1922LUT     172MHz  1732LUT
spartan-6       96MHz   1922LUT      95MHz  1909LUT
spartan-3       55MHz   2212LUT      56MHz  2165LUT

vgegok commented 2 years ago

Some warnings:

VCD info: dumpfile darksocv.vcd opened for output.
VCD warning: array word darksimv.soc0.core0.REGS[0] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[1] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[2] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[3] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[4] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[5] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[6] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[7] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[8] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[9] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[10] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[11] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[12] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[13] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[14] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[15] will conflict with an escaped identifier.

Full log:

make -C src all
make[1]: Entering directory '/home/vgegok/darkriscv/src'
/opt/riscv32-gcc/bin/riscv32-unknown-elf-gcc -Wall -fcommon -ffreestanding -I./include -O2 -march=rv32e -mabi=ilp32e -D__RISCV__ -DBUILD="\"Fri, 20 May 2022 16:20:08 +0800\"" -DARCH="\"rv32e\"" -mcmodel=medany -mexplicit-relocs  -S main.c -o main.s
/opt/riscv32-gcc/bin/riscv32-unknown-elf-as -march=rv32e -c main.s -o main.o
/opt/riscv32-gcc/bin/riscv32-unknown-elf-gcc -Wall -fcommon -ffreestanding -I./include -O2 -march=rv32e -mabi=ilp32e -D__RISCV__ -DBUILD="\"Fri, 20 May 2022 16:20:08 +0800\"" -DARCH="\"rv32e\"" -mcmodel=medany -mexplicit-relocs  -S stdio.c -o stdio.s
/opt/riscv32-gcc/bin/riscv32-unknown-elf-as -march=rv32e -c stdio.s -o stdio.o
/opt/riscv32-gcc/bin/riscv32-unknown-elf-gcc -Wall -fcommon -ffreestanding -I./include -O2 -march=rv32e -mabi=ilp32e -D__RISCV__ -DBUILD="\"Fri, 20 May 2022 16:20:08 +0800\"" -DARCH="\"rv32e\"" -mcmodel=medany -mexplicit-relocs  -S io.c -o io.s
/opt/riscv32-gcc/bin/riscv32-unknown-elf-as -march=rv32e -c io.s -o io.o
/opt/riscv32-gcc/bin/riscv32-unknown-elf-gcc -Wall -fcommon -ffreestanding -I./include -O2 -march=rv32e -mabi=ilp32e -D__RISCV__ -DBUILD="\"Fri, 20 May 2022 16:20:08 +0800\"" -DARCH="\"rv32e\"" -mcmodel=medany -mexplicit-relocs  -S banner.c -o banner.s
/opt/riscv32-gcc/bin/riscv32-unknown-elf-as -march=rv32e -c banner.s -o banner.o
/opt/riscv32-gcc/bin/riscv32-unknown-elf-as -march=rv32e -c boot.s -o boot.o
/opt/riscv32-gcc/bin/riscv32-unknown-elf-cpp -P  darksocv.ld.src darksocv.ld
/opt/riscv32-gcc/bin/riscv32-unknown-elf-ld -Tdarksocv.ld -Map=darksocv.map -m elf32lriscv  main.o stdio.o io.o banner.o boot.o -o darksocv.o
/opt/riscv32-gcc/bin/riscv32-unknown-elf-objdump -d darksocv.o > darksocv.lst
/opt/riscv32-gcc/bin/riscv32-unknown-elf-objcopy -O binary  darksocv.o darksocv.bin
hexdump -ve '1/4 "%08x\n"' darksocv.bin > darksocv.mem
#xxd -p -c 4 -g 4 darksocv.o > darksocv.mem
rm darksocv.bin
wc -l darksocv.mem
1877 darksocv.mem
mem ok.
sources ok.
make[1]: Leaving directory '/home/vgegok/darkriscv/src'
make -C sim all
make[1]: Entering directory '/home/vgegok/darkriscv/sim'
iverilog -Wall -I ../rtl -o darksocv darksimv.v ../rtl/darksocv.v ../rtl/darkuart.v ../rtl/darkriscv.v
darksimv.v:36: warning: timescale for darksimv inherited from another file.
./../rtl/config.vh:31: ...: The inherited timescale is here.
../rtl/darksocv.v:34: warning: timescale for darksocv inherited from another file.
./../rtl/config.vh:31: ...: The inherited timescale is here.
../rtl/darksocv.v:716: warning: implicit definition of wire 'FINISH_REQ'.
../rtl/darksocv.v:768: warning: implicit definition of wire 'IDLE'.
../rtl/darkuart.v:74: warning: timescale for darkuart inherited from another file.
./../rtl/config.vh:31: ...: The inherited timescale is here.
../rtl/darkriscv.v:55: warning: timescale for darkriscv inherited from another file.
./../rtl/config.vh:31: ...: The inherited timescale is here.
./darksocv
WARNING: ../rtl/darksocv.v:250: $readmemh(../src/darksocv.mem): Not enough words in the file for the requested range [0:2047].
VCD info: dumpfile darksocv.vcd opened for output.
VCD warning: array word darksimv.soc0.core0.REGS[0] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[1] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[2] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[3] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[4] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[5] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[6] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[7] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[8] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[9] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[10] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[11] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[12] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[13] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[14] will conflict with an escaped identifier.
VCD warning: array word darksimv.soc0.core0.REGS[15] will conflict with an escaped identifier.
reset (startup)

              vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
                  vvvvvvvvvvvvvvvvvvvvvvvvvvvv
rrrrrrrrrrrrr       vvvvvvvvvvvvvvvvvvvvvvvvvv
rrrrrrrrrrrrrrrr      vvvvvvvvvvvvvvvvvvvvvvvv
rrrrrrrrrrrrrrrrrr    vvvvvvvvvvvvvvvvvvvvvvvv
rrrrrrrrrrrrrrrrrr    vvvvvvvvvvvvvvvvvvvvvvvv
rrrrrrrrrrrrrrrrrr    vvvvvvvvvvvvvvvvvvvvvvvv
rrrrrrrrrrrrrrrr      vvvvvvvvvvvvvvvvvvvvvv  
rrrrrrrrrrrrr       vvvvvvvvvvvvvvvvvvvvvv    
rr                vvvvvvvvvvvvvvvvvvvvvv      
rr            vvvvvvvvvvvvvvvvvvvvvvvv      rr
rrrr      vvvvvvvvvvvvvvvvvvvvvvvvvv      rrrr
rrrrrr      vvvvvvvvvvvvvvvvvvvvvv      rrrrrr
rrrrrrrr      vvvvvvvvvvvvvvvvvv      rrrrrrrr
rrrrrrrrrr      vvvvvvvvvvvvvv      rrrrrrrrrr
rrrrrrrrrrrr      vvvvvvvvvv      rrrrrrrrrrrr
rrrrrrrrrrrrrr      vvvvvv      rrrrrrrrrrrrrr
rrrrrrrrrrrrrrrr      vv      rrrrrrrrrrrrrrrr
rrrrrrrrrrrrrrrrrr          rrrrrrrrrrrrrrrrrr
rrrrrrrrrrrrrrrrrrrr      rrrrrrrrrrrrrrrrrrrr
rrrrrrrrrrrrrrrrrrrrrr  rrrrrrrrrrrrrrrrrrrrrr

       INSTRUCTION SETS WANT TO BE FREE

boot0: text@0 data@7504 stack@8192 (688 bytes free)
board: simulation only (id=0)
build: Fri, 20 May 2022 16:20:08 +0800 for rv32e
core0/thread0: darkriscv@100.0MHz rv32e
uart0: 115200 bps (div=868)
timr0: frequency=1000000Hz (io.timer=99)
mtvec: not found (polling only)

Welcome to DarkRISCV!
> the __INTERACTIVE__ option is disabled, ending simulation...
****************************************************************************
DarkRISCV Pipeline Report (41941 clocks):
core0: 70% running, 7% waiting (0% i-bus, 7% d-bus/rd, 0% d-bus/wr), 23% idle
****************************************************************************
simulation ok.
make[1]: Leaving directory '/home/vgegok/darkriscv/sim'
make -C boards all
make[1]: Entering directory '/home/vgegok/darkriscv/boards'
no board selected to build, done.
make[1]: Leaving directory '/home/vgegok/darkriscv/boards'
root@vgegok-Xilinx:/home/vgegok/darkriscv# 

samsoniuk commented 2 years ago

Some warnings are related to an old problem: it was not possible to show arrays in gtkwave, but the problem was in fact related to iverilog. Once I learned how to show the arrays (especially the register set), the warnings started... just in case, I tried to find a way to disable the warnings and instead found how to enable all of them, so I hope to clean up all the warnings in the future! :)

vgegok commented 2 years ago

I found a new problem.

Makefile:

install:
+   make -C src all
    make -C boards install

config.vh:

//`define __RV32E__
`define __HARVARD__

vgegok@vgegok-Xilinx:~/darkriscv-vgegok$ make install BOARD=scarab_minispartan6-plus_lx9 CROSS=riscv32-unknown-elf CCPATH=/opt/riscv32-gcc/bin ARCH=rv32i HARVARD=1
make -C src all
make[1]: Entering directory '/home/vgegok/darkriscv-vgegok/src'
/opt/riscv32-gcc/bin/riscv32-unknown-elf-gcc -Wall -fcommon -ffreestanding -I./include -O2 -march=rv32i -mabi=ilp32e -D__RISCV__ -DBUILD="\"Sat, 21 May 2022 17:41:12 +0800\"" -DARCH="\"rv32i\"" -mcmodel=medany -mexplicit-relocs  -S main.c -o main.s
/opt/riscv32-gcc/bin/riscv32-unknown-elf-as -march=rv32i -c main.s -o main.o
/opt/riscv32-gcc/bin/riscv32-unknown-elf-gcc -Wall -fcommon -ffreestanding -I./include -O2 -march=rv32i -mabi=ilp32e -D__RISCV__ -DBUILD="\"Sat, 21 May 2022 17:41:12 +0800\"" -DARCH="\"rv32i\"" -mcmodel=medany -mexplicit-relocs  -S stdio.c -o stdio.s
/opt/riscv32-gcc/bin/riscv32-unknown-elf-as -march=rv32i -c stdio.s -o stdio.o
/opt/riscv32-gcc/bin/riscv32-unknown-elf-gcc -Wall -fcommon -ffreestanding -I./include -O2 -march=rv32i -mabi=ilp32e -D__RISCV__ -DBUILD="\"Sat, 21 May 2022 17:41:12 +0800\"" -DARCH="\"rv32i\"" -mcmodel=medany -mexplicit-relocs  -S io.c -o io.s
/opt/riscv32-gcc/bin/riscv32-unknown-elf-as -march=rv32i -c io.s -o io.o
/opt/riscv32-gcc/bin/riscv32-unknown-elf-gcc -Wall -fcommon -ffreestanding -I./include -O2 -march=rv32i -mabi=ilp32e -D__RISCV__ -DBUILD="\"Sat, 21 May 2022 17:41:12 +0800\"" -DARCH="\"rv32i\"" -mcmodel=medany -mexplicit-relocs  -S banner.c -o banner.s
/opt/riscv32-gcc/bin/riscv32-unknown-elf-as -march=rv32i -c banner.s -o banner.o
/opt/riscv32-gcc/bin/riscv32-unknown-elf-as -march=rv32i -c boot.s -o boot.o
/opt/riscv32-gcc/bin/riscv32-unknown-elf-cpp -P  -DHARVARD=1 darksocv.ld.src darksocv.ld
/opt/riscv32-gcc/bin/riscv32-unknown-elf-ld -Tdarksocv.ld -Map=darksocv.map -m elf32lriscv  main.o stdio.o io.o banner.o boot.o -o darksocv.o
/opt/riscv32-gcc/bin/riscv32-unknown-elf-ld: darksocv.o section `.text.startup' will not fit in region `ROM'
/opt/riscv32-gcc/bin/riscv32-unknown-elf-ld: section .data LMA [0000000000001000,0000000000001677] overlaps section .text.startup LMA [0000000000000fa8,00000000000016d3]
/opt/riscv32-gcc/bin/riscv32-unknown-elf-ld: region `ROM' overflowed by 1748 bytes
Makefile:113: recipe for target 'darksocv.o' failed
make[1]: *** [darksocv.o] Error 1
make[1]: Leaving directory '/home/vgegok/darkriscv-vgegok/src'
Makefile:50: recipe for target 'install' failed
make: *** [install] Error 2

Does RV32I not support HARVARD?
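The linker numbers above can be checked with a little arithmetic. The following C sketch uses a hypothetical helper, `region_overflow`, and assumes the ROM region started at 0 with a length of 0x1000 at the time of this build. That size is not shown in the thread; it is inferred from the .data LMA starting at 0x1000 and from the reported overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes by which a section whose last byte sits at last_lma (inclusive)
 * runs past a memory region given by its origin and length. */
uint32_t region_overflow(uint32_t origin, uint32_t length, uint32_t last_lma)
{
    assert(last_lma >= origin);
    uint32_t region_end = origin + length; /* first address past the region  */
    uint32_t used_end   = last_lma + 1;    /* first address past the section */
    return used_end > region_end ? used_end - region_end : 0;
}
```

With the values from the log, `region_overflow(0, 0x1000, 0x16d3)` gives 1748, matching the "overflowed by 1748 bytes" message, while a length of 0x2000 gives 0.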

vgegok commented 2 years ago

I tried again: I must first run make install, then make install HARVARD=1.

But the bitstream does not run.

vgegok commented 2 years ago

It is Working!!! 👇👇👇👇👇👇👇👇👇👇👇👇👇👇👇👇👇👇👇👇

darksocv.ld.src:

MEMORY
{
    IO  (rw!x) : ORIGIN = 0x80000000, LENGTH = 0x10
#if HARVARD
    ROM (x!rw) : ORIGIN = 0x00000000, LENGTH = 0x2000
    RAM (rw!x) : ORIGIN = 0x00002000, LENGTH = 0x2000
#else
    MEM (rwx)  : ORIGIN = 0x00000000, LENGTH = 0x4000
#endif

darksocv.v:

`ifdef __HARVARD__

    reg [31:0] ROM [0:2047]; // ro memory
    reg [31:0] RAM [0:2047]; // rw memory

    // memory initialization

    integer i;
    initial
    begin
        for(i=0;i!=2048;i=i+1)
        begin        
            ROM[i] = 32'd0;
            RAM[i] = 32'd0;
        end

        // workaround for vivado: no path in simulation and .mem extension

`ifdef XILINX_SIMULATOR
        $readmemh("darksocv.rom.mem",ROM);        
        $readmemh("darksocv.ram.mem",RAM);
`else
        $readmemh("../src/darksocv.rom.mem",ROM);        
        $readmemh("../src/darksocv.ram.mem",RAM);
`endif        
    end

ROM[IADDR[11:2]] ---> ROM[IADDR[12:2]]
RAM[DADDR[11:2]] ---> RAM[DADDR[12:2]]

MEM[IADDR[12:2]] ---> MEM[IADDR[13:2]]
MEM[DADDR[12:2]] ---> MEM[DADDR[13:2]]
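The new index ranges follow from the region lengths in the linker script. As a rough check, this C sketch (with hypothetical helpers `index_msb` and `word_index`, and assuming power-of-two sizes with 32-bit words) derives the upper bit of the ADDR[msb:2] slice from the length in bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bit of the word-index slice ADDR[msb:2] for a memory of
 * length_bytes bytes made of 32-bit words. */
unsigned index_msb(uint32_t length_bytes)
{
    uint32_t words = length_bytes / 4;
    assert(words >= 2 && (words & (words - 1)) == 0); /* power of two */
    unsigned bits = 0;
    while (((uint32_t)1 << bits) < words)
        bits++;
    return bits + 1; /* the slice starts at bit 2 */
}

/* Word index selected by a byte address inside such a memory. */
uint32_t word_index(uint32_t addr, uint32_t length_bytes)
{
    uint32_t words = length_bytes / 4;
    return (addr >> 2) & (words - 1);
}
```

A length of 0x2000 (2048 words) gives `index_msb` = 12, so ADDR[12:2], and 0x4000 (4096 words) gives 13, so ADDR[13:2]. The previous slices, [11:2] and [12:2], correspond to lengths of 0x1000 and 0x2000.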

samsoniuk commented 2 years ago

The core always works with a Harvard architecture, but it is possible to build the SoC w/ a pure Harvard architecture (separate ROM/RAM memories, w/ the HARVARD option active) or w/ a hybrid Harvard/von Neumann architecture. The hybrid is done by connecting the instruction and data buses to different ports of the same BRAM, so that the instruction and data buses can work in parallel, as defined by the Harvard architecture, but with a unified address space, as defined by the von Neumann architecture.

Since the firmware is composed of ~5KB of code and ~1.5KB of data, the hybrid memory model can easily fit both code and data in less than 8KB. However, in the pure Harvard memory model we must create an 8KB memory for code and at least a 2KB memory for data... to make it simpler, I just defined the same size for both, so that a pair of 8KB+8KB memories is created, via the MLEN option in config.vh.

In the case of the firmware, there is a HARVARD option in the Makefile, but unfortunately there is not much integration with the hardware options, so the option must be enabled by hand. Although it is possible to automate this integration (by parsing config.vh in the Makefile), I am not sure that the pure Harvard model has any advantage over the hybrid model, so I will keep the hybrid memory model as the default.

Oh, I found a small bug in src/Makefile, which explains why make did not work directly. I also cleaned up the simulation warnings!

vgegok commented 2 years ago

I saw your modification. It's great! Looks like you're still working at night——dark:)

vgegok commented 2 years ago

About the manual vs. automatic results in your table above: how did you get these scores? With CoreMark? Can you share the source code of your application?

samsoniuk commented 2 years ago

Those results are from the Xilinx ISE 14.7 synthesis report:

$ egrep "(LUTs|Frequency|Target)" darksocv.syr 
---- Target Parameters
Target Device                      : xc7k420t-2-ffg901
---- Target Options
 Number of Slice LUTs:                 1633  out of  260600     0%  
   Minimum period: 4.197ns (Maximum Frequency: 238.276MHz)

So, the above results are the frequency and area used in the FPGA (I just grabbed a random report here, not sure it is the latest version on GitHub).

The effective performance in MIPS depends on the application: in theory it is possible to peak at IPC = 1 (instruction per clock), but that requires optimizing the code by hand in order to unroll all loops and minimize the use of load/store instructions and/or use LUTRAM for data, in order to avoid wait-states (the BRAM requires 1 wait-state in the load instruction).

The default test firmware, compiled with GCC -O2, prints a detailed pipeline report at the end of simulation:

****************************************************************************
DarkRISCV Pipeline Report (45485 clocks):
core0: 71% run, 6% wait (0% i-bus, 6% d-bus/rd, 0% d-bus/wr), 23% idle
****************************************************************************

In the report it is possible to see how many clocks the simulation used and how many of them were lost to wait-states and pipeline flushes. In the above case, we lost 23% of the clocks to pipeline flushes and 6% to wait-states, so the processor really executes useful instructions only 71% of the time, which results in an IPC = 0.71. Assuming the best frequency score of 244MHz, we have 244 x 0.71 = 173MIPS.
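The same estimate can be written as a tiny C sketch. The function names `effective_ipc` and `effective_mips` are hypothetical, and the model simply treats the "run" percentage of the pipeline report as the fraction of clocks that retire one instruction.

```c
#include <assert.h>

/* Fraction of clocks doing useful work, taken from the "run" percentage. */
double effective_ipc(double run_percent)
{
    assert(run_percent >= 0.0 && run_percent <= 100.0);
    return run_percent / 100.0;
}

/* Estimated MIPS: clock frequency in MHz times instructions per clock. */
double effective_mips(double freq_mhz, double run_percent)
{
    return freq_mhz * effective_ipc(run_percent);
}
```

With the numbers from the thread, `effective_mips(244.0, 71.0)` is about 173, the figure quoted above.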

Although this is not a real benchmark, it is a real application, so the empirical result is good enough to check how the core is spending the clocks in the pipeline.

vgegok commented 2 years ago

I made some updates:

  1. Add io.timeus, used to provide system time, similar to clock_t clock()
  2. Migrate CoreMark
  3. Independent allocation of MLEN for the Harvard and von Neumann architectures
  4. Modify the README and some Makefile contents
  5. ........

You can evaluate whether the pull request can be submitted. https://github.com/vgegok/darkriscv

samsoniuk commented 2 years ago

wow! I will open a new issue regarding the coremark!

samsoniuk commented 2 years ago

I opened issue #52, so new comments, suggestions and general posts can start there and be moved to separate issues as they switch from suggestions to real implementation.