openhwgroup / cva6

The CORE-V CVA6 is an Application class 6-stage RISC-V CPU capable of booting Linux
https://docs.openhwgroup.org/projects/cva6-user-manual/

Slow Dhrystone on FPGA #580

Closed manox closed 1 year ago

manox commented 3 years ago

I have tested systems generated with Chipyard on an FPGA (VCU118). With Rocket and BOOM I get plausible results for both Dhrystone and CoreMark. With CVA6, however, the Dhrystone results are relatively poor (~0.7 DMIPS/MHz) and depend on the L2 cache, which is not the case with Rocket or BOOM. CoreMark is okay (> 2 CoreMark/MHz).

I'm happy for suggestions, thanks.
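
For readers comparing numbers, here is a minimal sketch of how a DMIPS/MHz figure is usually derived from a bare-metal run. It assumes the run reports a loop count and an elapsed cycle count; the object and function names are made up for illustration and do not come from Chipyard or CVA6. The only fixed constant is the conventional VAX 11/780 reference of 1757 Dhrystones per second for 1 DMIPS.

```scala
object DhrystoneScore {
  // Dhrystones per second of the VAX 11/780 reference machine, defined as 1 DMIPS.
  val VaxDhrystonesPerSecond = 1757.0

  /** DMIPS/MHz for a run of `iterations` Dhrystone loops that took `cycles` core cycles. */
  def dmipsPerMHz(iterations: Long, cycles: Long): Double = {
    require(iterations > 0 && cycles > 0, "iterations and cycles must be positive")
    // Dhrystones/s = iterations * f / cycles. Dividing by f in MHz leaves 1e6 * iterations / cycles.
    val dhrystonesPerSecondPerMHz = 1e6 * iterations.toDouble / cycles.toDouble
    dhrystonesPerSecondPerMHz / VaxDhrystonesPerSecond
  }

  /** Average cycles per Dhrystone iteration implied by a DMIPS/MHz score. */
  def cyclesPerIteration(score: Double): Double = {
    require(score > 0, "score must be positive")
    1e6 / (VaxDhrystonesPerSecond * score)
  }
}
```

Read the other way around, `DhrystoneScore.cyclesPerIteration(0.7)` comes out at roughly 813 cycles per Dhrystone loop, which gives a feel for how many cycles each iteration costs at the score reported above.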

RaphaelKlink commented 3 years ago

Hey @manox,

I tried the same thing and got similar results. The cache dependency hints at a memory interface problem.

manox commented 3 years ago

Hi @RaphaelKlink, I know, it's me, Mark. ;)

Wcm926 commented 3 years ago

Hi, I tried to run CoreMark on an FPGA (Genesys 2), but I get an illegal instruction error when running it. How did you solve this problem?

RaphaelKlink commented 3 years ago

We did not use the Ariane SoC, but the Chipyard and OpenPiton frameworks. Both of them can run CoreMark and Dhrystone bare-metal on the FPGA. As mentioned in this issue, the Chipyard build has unusually low performance values.

Wcm926 commented 3 years ago

Hi, did you test systems generated with Chipyard on an FPGA (VCU118), and was CoreMark around 2 CoreMark/MHz?

michael-etzkorn commented 3 years ago

@manox I'd be willing to corroborate your scores, but the parameters retrieved by TestHarness aren't set when the config is written as

class Cva6VCU118Config extends Config(
  new WithVCU118Tweaks ++
  new chipyard.CVA6Config)

I assume you changed ExtMem in the CVA6 Chisel module implementation. What other changes did you make when generating the core?

Moschn commented 3 years ago

This is just a guess, but I believe Chipyard uses the write-through L1 cache variant of CVA6 (https://github.com/ucb-bar/cva6-wrapper/blob/139741a584d7e3c0446db592b5d99529bd6cf9fa/src/main/resources/vsrc/Makefile#L132). With a write-through L1, every store is forwarded to the next level, which would explain why the performance depends on the L2 cache.
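
To illustrate the guess above, here is a toy model that counts how many writes reach L2 for a stream of store addresses under a write-through and a write-back L1. It is only a sketch: the direct-mapped organisation, the `numLines` and `lineBytes` defaults, and all names are assumptions for illustration and do not describe the actual CVA6 cache. It also ignores write buffers, loads, and the final flush of dirty lines.

```scala
object StoreTrafficSketch {
  /** Number of writes that reach L2 for the given store addresses (assumed non-negative).
    * Write-through: every store is forwarded.
    * Write-back (write-allocate): only evictions of dirty lines are written out.
    */
  def l2Writes(stores: Seq[Long], writeThrough: Boolean,
               numLines: Int = 64, lineBytes: Int = 16): Int = {
    if (writeThrough) stores.length
    else {
      val tags  = Array.fill(numLines)(-1L)   // line address held per set, -1 means empty
      val dirty = Array.fill(numLines)(false) // whether the held line has been modified
      var writes = 0
      for (addr <- stores) {
        val line = addr / lineBytes
        val idx  = (line % numLines).toInt
        if (tags(idx) != line) {
          if (dirty(idx)) writes += 1 // evicting a dirty line costs one write-back to L2
          tags(idx) = line
        }
        dirty(idx) = true
      }
      writes
    }
  }
}
```

For a loop that keeps storing to the same few addresses, for example `Seq.fill(1000)(Seq(0L, 8L, 16L, 24L)).flatten`, the write-through count equals the number of stores, while the write-back count stays at zero because the lines are never evicted. If the benchmark behaves like that, its store traffic would be bound by L2 latency only in the write-through case.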

manox commented 3 years ago

I used a relatively old version of Chipyard in which the FPGA flow did not yet exist, and brought the design up on the FPGA in my own Vivado project. So I cannot say anything about the config mentioned above, sorry.

MikeOpenHWGroup commented 1 year ago

Hi @manox, @Moschn, @michael-etzkorn, @Wcm926 and @RaphaelKlink, thanks for your interest in CVA6. This issue has not been updated in ~1.5 years, so I will assume it is resolved and will close this issue. There is a related issue (#1035) which you can track. If you are still having trouble, please feel free to open another.