Closed amr-25 closed 1 year ago
Hi,
In which context are you reading those? In Linux user mode?
In user mode you can use ucycle instead of mcycle. That one should be good.
Hoooo, just looking right now, ucycle is turned off to save area XD
See ucycleAccess = CsrAccess.NONE, in : https://github.com/SpinalHDL/VexRiscv/blob/24795ef09b88defe2ee1bb335e5caaf7e07e64ff/src/main/scala/vexriscv/plugin/CsrPlugin.scala#L110
Should have been at least READ_ONLY
So, to turn it on, you can go into the litex pythondata-cpu-vexriscv_smp, go to the inner VexRiscv repo (pythondata_cpu_vexriscv_smp/verilog/ext/VexRiscv), patch it there, and delete the pregenerated netlist in pythondata_cpu_vexriscv_smp/verilog.
You will need to have SBT installed.
My objective is to measure the cycles spent on a benchmark when configured with one core and with two cores.
So with the pythondata-cpu-vexriscv_smp modification I posted just above, that should be good.
But overall, if the runtime of the benchmark is long enough, I guess the overhead of using the standard C library time functions is fine.
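For reference, a minimal sketch of how the benchmark could read the counter from user mode once ucycleAccess is READ_ONLY. It assumes a GCC/Clang toolchain with inline asm; `read_cycles`, `workload` and `measure_workload` are made-up names for illustration, and the non-RISC-V branch is only a fallback so the snippet also builds on a host PC.

```c
#include <stdint.h>
#include <time.h>

/* Read the user-level cycle counter.
 * On RISC-V this uses rdcycle/rdcycleh (CSRs cycle/cycleh), which only work
 * once ucycleAccess is at least READ_ONLY in the CsrPlugin configuration.
 * On any other host it falls back to clock() (assumption: host build is
 * only used to check that the code compiles and runs). */
static inline uint64_t read_cycles(void) {
#if defined(__riscv) && (__riscv_xlen == 32)
    uint32_t hi, lo, hi2;
    do {
        /* Re-read the high half to detect a carry between the two reads. */
        __asm__ volatile ("rdcycleh %0" : "=r"(hi));
        __asm__ volatile ("rdcycle  %0" : "=r"(lo));
        __asm__ volatile ("rdcycleh %0" : "=r"(hi2));
    } while (hi != hi2);
    return ((uint64_t)hi << 32) | lo;
#elif defined(__riscv)
    uint64_t c;
    __asm__ volatile ("rdcycle %0" : "=r"(c));
    return c;
#else
    return (uint64_t)clock();
#endif
}

/* Hypothetical workload standing in for the real benchmark. */
static uint32_t workload(uint32_t n) {
    volatile uint32_t acc = 0;
    for (uint32_t i = 0; i < n; i++) {
        acc += i;
    }
    return acc;
}

/* Cycles (or clock() ticks on the host fallback) spent in workload(n). */
static uint64_t measure_workload(uint32_t n) {
    uint64_t start = read_cycles();
    (void)workload(n);
    return read_cycles() - start;
}
```

The counter is per hart, so for the two-core configuration the start and end reads should happen on the same core.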
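A minimal sketch of the C library route, assuming a C11 libc that provides `timespec_get`. `wall_ns` and `elapsed_seconds` are made-up helper names. It uses wall-clock time rather than `clock()`, since `clock()` reports processor time, which is probably not what you want when comparing one core against two.

```c
#include <stdint.h>
#include <time.h>

/* Wall-clock time in nanoseconds via C11 timespec_get(); returns 0 on failure.
 * Note: TIME_UTC is not guaranteed to be monotonic. */
static uint64_t wall_ns(void) {
    struct timespec ts;
    if (timespec_get(&ts, TIME_UTC) != TIME_UTC) {
        return 0;
    }
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Elapsed seconds between two wall_ns() samples. */
static double elapsed_seconds(uint64_t start_ns, uint64_t end_ns) {
    return (double)(end_ns - start_ns) / 1e9;
}
```

Take one `wall_ns()` sample before the benchmark and one after, then pass both to `elapsed_seconds`.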
Alright, this modification will change the hardware, but doesn't it also require changes on the SBI side, such as regenerating opensbi.bin?
It will only change hardware. Software can stay the same.
I deleted .v files inside pythondata_cpu_vexriscv_smp/verilog. But it is asking for one of the files while I make the smp. The error is ERROR: [Common 17-69] Command failed: File '/home/user/Litex/pythondata-cpu-vexriscv-smp/pythondata_cpu_vexriscv_smp/verilog/Ram_1w_1rs_Generic.v' does not exist
Hoo my bad, do not delete the Ram*.v verilog, i forgot about those
Thank you. It worked :) Similarly, are there other counters (instructions retired, etc.) that can be enabled to get more data on performance?
Hooo nice :D
Sure, there is uinstretAccess for the instructions-retired counter, which you can enable. There is also utimeAccess, but it is quite similar to ucycle, so not really useful.
I found this project https://github.com/firesim/firesim/blob/main/sim/firesim-lib/src/main/scala/bridges/TracerVBridge.scala which lets you profile performance by tracing. Are there plans to add something similar to evaluate performance beyond the 3 counters (rdcycle, time and instret)? (There is something similar in the riscy cores: https://github.com/hchsiao/riscv/blob/master/riscv_tracer.sv) It would be nice to have it with VexRiscv as well. Thank you
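A rough sketch of using the instret counter together with the cycle counter to get instructions per cycle, assuming uinstretAccess has been enabled the same way as ucycleAccess. `read_instret_lo`, `delta32` and `ipc` are made-up names, and only the low 32 bits are read, so the measured region must stay under 2^32 instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Low 32 bits of the user-level instret CSR.
 * On non-RISC-V hosts this returns 0 (placeholder so the sketch builds). */
static inline uint32_t read_instret_lo(void) {
#if defined(__riscv)
    unsigned long v;
    __asm__ volatile ("rdinstret %0" : "=r"(v));
    return (uint32_t)v;
#else
    return 0;
#endif
}

/* Difference of two 32-bit counter samples; unsigned arithmetic makes this
 * correct across a single wraparound of the counter. */
static inline uint32_t delta32(uint32_t start, uint32_t end) {
    return end - start;
}

/* Instructions per cycle for a measured region. */
static double ipc(uint64_t instret_delta, uint64_t cycle_delta) {
    assert(cycle_delta > 0);
    return (double)instret_delta / (double)cycle_delta;
}
```

Sample both counters before and after the region of interest and pass the two deltas to `ipc`.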
Hi,
So, tracing things in simulation, or tracing things on real hardware? Do you want to trace the flow of instructions and events in the pipeline?
(currently there is no plan)
Yes, tracing in cycle-accurate simulation or on hardware. I guess Verilator can do the tracing in simulation? Not sure. My idea is something like a plugin which is triggered by an instruction (similar to rdcycle; for instance, startrec) to start recording the instructions retiring, and which stops recording through another instruction.
> something like a plugin which is triggered by an instruction (similar to rdcycle; for instance, startrec) to start recording the instructions retiring, and which stops recording through another instruction.
Ahh, this sounds like the RISC-V privileged performance counters. That could do it (not implemented yet).
So, is creating such a plugin feasible by modifying the customcsrplugin? If yes, I can look into it. Not sure if it can be realised with Verilator itself as well.
@amr-25 Yes, https://github.com/SpinalHDL/VexRiscv/blob/051d140c33ce1480e10bdf76668fceae8ff59bef/src/main/scala/vexriscv/demo/CustomCsrDemoPlugin.scala#L11 is not very far from it ^^
The RISC-V feature is specified in https://github.com/riscv/riscv-isa-manual/releases/download/Priv-v1.12/riscv-privileged-20211203.pdf
in 3.1.10 Hardware Performance Monitor
They are the hpmcounters and related hardware. With the mhpmeventX registers you can specify which kind of hardware event makes hpmcounterX count up, e.g. instruction retired, cache miss and so on.
> Not sure if it can be realised with Verilator itself as well.
It isn't really related to Verilator; the idea is to be able to access those counters directly from the software running on the CPU itself :)
So it could be used in any simulation / hardware.
Right. Sorry if I have misunderstood. But what if we simulate the CPU in Verilator with TRACE on, and extract the information (e.g. instructions retired) from the .fst that is generated? I was going through https://tomverbeure.github.io/2022/02/20/GDBWave-Post-Simulation-RISCV-SW-Debugging.html where two signals, lastStagePc[31:0] and lastStageIsValid, are used to get the number of instructions retired. This way, we need not have a counter that can be accessed by the software; we just probe the signals and get the information of interest. Won't both of these approaches, i.e. having a counter and the Verilator simulation, serve the same purpose?
Yes right, that's one way to do it ^^ I would say the FST trace wave is good until you need real hardware to interact with real peripherals, or need to run very long stuff like booting Linux. Otherwise, yes, I think things are mostly interchangeable. Maybe the only thing you need in order to analyse things in the FST wave is a way to know when to start and stop counting.
I'm trying to measure clock cycles through the counters present. But I cannot use rdcycle or mcycle due to the presence of the SBI supervisor. I get sbi_trap_error, mcause=2 if I try to use asm volatile ("csrr %0, mcycle" : "=r" (b));. Can I get some ideas on how to measure performance? Thank you
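To illustrate the start/stop point, here is a small sketch of counting retired instructions from per-cycle samples of lastStageIsValid and lastStagePc, using a start PC and a stop PC as the markers. The `TraceSample` layout and `count_retired` are hypothetical; getting the samples out of the .fst file is assumed to be done by some other tool and is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One sample per clock cycle of the two signals named in the thread.
 * Hypothetical in-memory layout, not a real FST reader structure. */
typedef struct {
    bool     lastStageIsValid;
    uint32_t lastStagePc;
} TraceSample;

/* Count instructions retired from the first retirement at start_pc
 * (inclusive) up to the next retirement at stop_pc (exclusive).
 * If cycles_out is non-NULL it receives the number of cycles in that window. */
static size_t count_retired(const TraceSample *s, size_t n,
                            uint32_t start_pc, uint32_t stop_pc,
                            size_t *cycles_out) {
    assert(s != NULL || n == 0);
    bool counting = false;
    size_t retired = 0;
    size_t cycles = 0;
    for (size_t i = 0; i < n; i++) {
        bool valid = s[i].lastStageIsValid;
        if (!counting) {
            if (valid && s[i].lastStagePc == start_pc) {
                counting = true;   /* start marker reached */
            } else {
                continue;          /* still before the window */
            }
        } else if (valid && s[i].lastStagePc == stop_pc) {
            break;                 /* stop marker reached */
        }
        cycles++;
        if (valid) {
            retired++;
        }
    }
    if (cycles_out != NULL) {
        *cycles_out = cycles;
    }
    return retired;
}
```

The start and stop PCs could be, for example, the addresses of the first instruction of the benchmark function and of the instruction right after the call, taken from the ELF disassembly.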