ultraembedded / biriscv

32-bit Superscalar RISC-V CPU
Apache License 2.0

Benchmark scores #11

Closed kuopinghsu closed 3 years ago

kuopinghsu commented 3 years ago

Running Dhrystone and CoreMark needs more than 64 KB of memory. I modified the code to use a 128 KB TCM in the fork below, and added an environment to run the benchmarks.

https://github.com/kuopinghsu/biriscv

I got the following benchmark scores:

CoreMark:
In memory: CoreMark/MHz: 3.333047
In TCM: CoreMark/MHz: 3.345606

Dhrystone:
In memory: DMIPS_Per_MHz: 2.243
In TCM: DMIPS_Per_MHz: 2.415

I can't reproduce the 4.1 CoreMark/MHz score quoted for the biRISC-V core. Could you share how you got 4.1 CoreMark/MHz?

ultraembedded commented 3 years ago

Hi,

I used the same GCC version and compiler flags as SiFive did to get their CoreMark scores (so I had a fair comparison). I remember that it was somewhat compiler version specific at the time. I’ll try and dig out the details for you.

I also used the ‘default’ configuration for biRISC-V: https://github.com/ultraembedded/biriscv/blob/master/docs/configuration.md

ultraembedded commented 3 years ago

I think the compiler flags (that I got from some SiFive repos) were:

CFLAGS+=-O2 -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-tree-dominator-opts -fno-reg-struct-return -fno-rename-registers --param case-values-threshold=8 -fno-crossjumping -freorder-blocks-and-partition -fno-tree-loop-if-convert -fno-tree-sink -fgcse-sm -fno-strict-overflow

kuopinghsu commented 3 years ago

Thanks for your quick reply. Here is the updated result of running CoreMark in TCM with those flags. I got CoreMark/MHz: 3.666170.

        SystemC 2.3.3-Accellera --- May 18 2021 20:21:23
        Copyright (c) 1996-2018 by all Contributors,
        ALL RIGHTS RESERVED
Running: ../../../sw/coremark/coremark.elf
Memory: 0x0 - 0x1d26f (Size=116KB) [.text]
Memory: 0x1d270 - 0x1d2ab (Size=0KB) [.eh_frame]
Memory: 0x1d2b0 - 0x1f193 (Size=7KB) [.data]
Memory: 0x1f194 - 0x231e7 (Size=16KB) [.bss]

Info: (I702) default timescale unit used for tracing: 1 ns (sysc_wave.vcd)
2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 1091057
Total time (secs): 0.010911
Iterations/Sec   : 366.616960
Iterations       : 4
Compiler version : GCC10.2.0
Compiler flags   : -O2 -march=rv32im -mabi=ilp32 -nostartfiles -nostdlib -L../common -DPERFORMANCE_RUN=1 -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-tree-dominator-opts -fno-reg-struct-return -fno-rename-registers --param case-values-threshold=8 -fno-crossjumping -freorder-blocks-and-partition -fno-tree-loop-if-convert -fno-tree-sink -fgcse-sm -fno-strict-overflow   -lc -lm -lgcc -lsys -T ../common/tcm.ld
Memory location  : STACK
seedcrc          : 0xe9f5
[0]crclist       : 0xe714
[0]crcmatrix     : 0x1fd7
[0]crcstate      : 0x8e3a
[0]crcfinal      : 0x9f95
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 366.616960 / GCC10.2.0 -O2 -march=rv32im -mabi=ilp32 -nostartfiles -nostdlib -L../common -DPERFORMANCE_RUN=1 -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-tree-dominator-opts -fno-reg-struct-return -fno-rename-registers --param case-values-threshold=8 -fno-crossjumping -freorder-blocks-and-partition -fno-tree-loop-if-convert -fno-tree-sink -fgcse-sm -fno-strict-overflow   -lc -lm -lgcc -lsys -T ../common/tcm.ld / STACK
CoreMark/MHz: 3.666170
TB: Aborted at 13475360 ns
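
As a cross-check, the CoreMark/MHz figure in the log above can be recomputed from the `Iterations` and `Total ticks` lines. This is a minimal sketch, not the benchmark's own reporting code. It assumes one tick is one CPU clock cycle and that the testbench counts 100,000,000 ticks per second; both are inferred from the logged numbers (1091057 ticks reported as 0.010911 s), not stated in the thread. The function names are hypothetical.

```python
def coremark_per_mhz(iterations: int, total_ticks: int) -> float:
    """CoreMark/MHz, assuming one tick equals one CPU clock cycle."""
    # Iterations per cycle, scaled to iterations per million cycles.
    return iterations / total_ticks * 1e6


def iterations_per_sec(iterations: int, total_ticks: int,
                       ticks_per_sec: int = 100_000_000) -> float:
    """Iterations/Sec for an assumed tick rate (100 MHz is an inference)."""
    return iterations * ticks_per_sec / total_ticks


# Values copied from the GCC 10.2.0 TCM run above.
print(f"CoreMark/MHz:   {coremark_per_mhz(4, 1091057):.6f}")
print(f"Iterations/Sec: {iterations_per_sec(4, 1091057):.6f}")
```

Under those assumptions both values agree with the logged 3.666170 and 366.616960 to six decimals, so the score depends only on the cycle count per iteration and not on the simulated clock frequency.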

ultraembedded commented 3 years ago

I think it is likely that I used gcc version 7.2.0 (as that is what I have installed). It could be that I used a different version at the time.

kuopinghsu commented 3 years ago

I tried GCC 7.2.0. Running from memory I get 4.059887 CoreMark/MHz, and running from TCM I get 4.142030 CoreMark/MHz. This matches your results. Thanks a lot.

        SystemC 2.3.3-Accellera --- May 18 2021 20:21:23
        Copyright (c) 1996-2018 by all Contributors,
        ALL RIGHTS RESERVED
Running: ../../../sw/coremark/coremark.elf
Memory: 0x80000000 - 0x8001d55b (Size=117KB) [.text]
Memory: 0x8001d55c - 0x8001d597 (Size=0KB) [.eh_frame]
Memory: 0x8001d598 - 0x8001f47b (Size=7KB) [.data]
Memory: 0x8001f47c - 0x800234cf (Size=16KB) [.bss]

Info: (I702) default timescale unit used for tracing: 1 ns (sysc_wave.vcd)
2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 985249
Total time (secs): 0.009852
Iterations/Sec   : 405.988740
Iterations       : 4
Compiler version : GCC7.2.0
Compiler flags   : -O2 -march=rv32im -mabi=ilp32 -nostartfiles -nostdlib -L../common -DPERFORMANCE_RUN=1 -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-tree-dominator-opts -fno-reg-struct-return -fno-rename-registers --param case-values-threshold=8 -fno-crossjumping -freorder-blocks-and-partition -fno-tree-loop-if-convert -fno-tree-sink -fgcse-sm -fno-strict-overflow   -lc -lm -lgcc -lsys -T ../common/default.ld
Memory location  : STACK
seedcrc          : 0xe9f5
[0]crclist       : 0xe714
[0]crcmatrix     : 0x1fd7
[0]crcstate      : 0x8e3a
[0]crcfinal      : 0x9f95
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 405.988740 / GCC7.2.0 -O2 -march=rv32im -mabi=ilp32 -nostartfiles -nostdlib -L../common -DPERFORMANCE_RUN=1 -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-tree-dominator-opts -fno-reg-struct-return -fno-rename-registers --param case-values-threshold=8 -fno-crossjumping -freorder-blocks-and-partition -fno-tree-loop-if-convert -fno-tree-sink -fgcse-sm -fno-strict-overflow   -lc -lm -lgcc -lsys -T ../common/default.ld / STACK
CoreMark/MHz: 4.059887
TB: Aborted at 13058020 ns

        SystemC 2.3.3-Accellera --- May 18 2021 20:21:23
        Copyright (c) 1996-2018 by all Contributors,
        ALL RIGHTS RESERVED
Running: ../../../sw/coremark/coremark.elf
Memory: 0x0 - 0x1d55b (Size=117KB) [.text]
Memory: 0x1d55c - 0x1d597 (Size=0KB) [.eh_frame]
Memory: 0x1d598 - 0x1f473 (Size=7KB) [.data]
Memory: 0x1f474 - 0x234c7 (Size=16KB) [.bss]

Info: (I702) default timescale unit used for tracing: 1 ns (sysc_wave.vcd)
2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 965710
Total time (secs): 0.009657
Iterations/Sec   : 414.203022
Iterations       : 4
Compiler version : GCC7.2.0
Compiler flags   : -O2 -march=rv32im -mabi=ilp32 -nostartfiles -nostdlib -L../common -DPERFORMANCE_RUN=1 -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-tree-dominator-opts -fno-reg-struct-return -fno-rename-registers --param case-values-threshold=8 -fno-crossjumping -freorder-blocks-and-partition -fno-tree-loop-if-convert -fno-tree-sink -fgcse-sm -fno-strict-overflow   -lc -lm -lgcc -lsys -T ../common/tcm.ld
Memory location  : STACK
seedcrc          : 0xe9f5
[0]crclist       : 0xe714
[0]crcmatrix     : 0x1fd7
[0]crcstate      : 0x8e3a
[0]crcfinal      : 0x9f95
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 414.203022 / GCC7.2.0 -O2 -march=rv32im -mabi=ilp32 -nostartfiles -nostdlib -L../common -DPERFORMANCE_RUN=1 -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-tree-dominator-opts -fno-reg-struct-return -fno-rename-registers --param case-values-threshold=8 -fno-crossjumping -freorder-blocks-and-partition -fno-tree-loop-if-convert -fno-tree-sink -fgcse-sm -fno-strict-overflow   -lc -lm -lgcc -lsys -T ../common/tcm.ld / STACK
CoreMark/MHz: 4.142030
TB: Aborted at 12255460 ns
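
To put the three flagged runs in this thread side by side, the short sketch below computes the relative differences between them. The scores are copied from the logs; the dictionary layout and the `percent_gain` helper are illustrative only.

```python
# CoreMark/MHz scores copied from the logs in this thread
# (all three runs use the same SiFive-derived compiler flags).
scores = {
    ("GCC 10.2.0", "TCM"): 3.666170,
    ("GCC 7.2.0", "memory"): 4.059887,
    ("GCC 7.2.0", "TCM"): 4.142030,
}


def percent_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


compiler_gain = percent_gain(scores[("GCC 7.2.0", "TCM")],
                             scores[("GCC 10.2.0", "TCM")])
tcm_gain = percent_gain(scores[("GCC 7.2.0", "TCM")],
                        scores[("GCC 7.2.0", "memory")])

print(f"GCC 7.2.0 vs GCC 10.2.0 (both in TCM): {compiler_gain:+.2f}%")
print(f"TCM vs memory (both GCC 7.2.0):        {tcm_gain:+.2f}%")
```

With these numbers the compiler version accounts for roughly a 13% difference, while moving from memory to TCM accounts for roughly 2%.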

ultraembedded commented 3 years ago

Ok, good! (Closing the issue now).