riscv / riscv-control-transfer-records

This repo contains a RISC-V ISA extension (proposal) to allow recording of control transfer history to on-chip registers, to support usages associated with profiling and debug.
https://jira.riscv.org/browse/RVG-62
Creative Commons Attribution 4.0 International

[Question] Inquiry about software status on CTR extension #26

Open gaoyichuan opened 2 months ago

gaoyichuan commented 2 months ago

Background

I am currently involved in a project that aims to support the CTR extension in RISC-V processor RTL. Given how important software is to testing and debugging CTR, I am reaching out to ask about the current status and future roadmap of CTR software support.

Specific Questions

  1. General: Does the TG have a plan for how CTR will be integrated into the Linux perf framework? Judging from the current PMU design, I think configuration and control of CTR would go through SBI via a new extension, and the data would be read by the kernel itself via CSR reads. Is my understanding correct?
  2. SBI: Is there any ongoing discussion or development on the SBI side, such as a specification for a new extension?
  3. Kernel: Will CTR be supported inside the riscv-pmu-sbi driver, or in arch/riscv as platform-related code (the route x86 LBR takes)?

Thanks for taking the time to answer these questions. I can also help on the software side if needed.

rajnesh-kanwal commented 2 months ago

> General: Does the TG have a plan for how CTR will be integrated into the Linux perf framework? Judging from the current PMU design, I think configuration and control of CTR would go through SBI via a new extension, and the data would be read by the kernel itself via CSR reads. Is my understanding correct?

In the current PoC we delegate CTR to S-mode, and the kernel itself handles everything related to CTR. M-mode/SBI is not involved other than for programming events. A recent series (https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@rivosinc.com/) adds support for counter delegation, which further eliminates the need to program counters through SBI.

> SBI: Is there any ongoing discussion or development on the SBI side, such as a specification for a new extension?

As pointed out above, it's not needed. The kernel has access to all the CSRs needed to program CTR.

> Kernel: Will CTR be supported inside the riscv-pmu-sbi driver, or in arch/riscv as platform-related code (the route x86 LBR takes)?

In the PoC kernel driver at v0.8.3, I had a separate file, riscv_ctr.c, under drivers/perf/, similar to what Arm does.

I am just cleaning up some changes and doing some fixes in QEMU before I can send the v1.0 patches upstream.

gaoyichuan commented 2 months ago

I see, thanks for the clarification. Since this makes CTR (implicitly) depend on Smcdeleg/Ssccfg and their dependencies, will this be mentioned in the spec document? Knowing which extensions are needed before testing would be helpful when designing RTL.

Glad to hear that a PoC driver and QEMU support exist. Will this code be open-sourced soon (or did I just not find it)? PoC-quality software is good enough for initial RTL testing and debugging on the hardware side.

rajnesh-kanwal commented 2 months ago

Neither the CTR extension itself nor the PoC requires Smcdeleg and Ssccfg. The kernel's CTR implementation only requires the counters to be programmed so that samples can be collected using the LCOFI IRQ. That counter programming can be done through the SBI interface or, if Smcdeleg and Ssccfg are present, by the kernel itself. @bcstrongx, correct me if I am wrong.

I am done with the QEMU and kernel patches. I will drop a link here when I upstream them (hopefully within 2-3 days); I am just testing some corner cases. For Spike, we have v0.5.3 of CTR implemented. Here is the change: https://github.com/rivosinc/riscv-isa-sim/commits/dev/rkanwal/ctr_support/ Expect some delay there for v1.0.

rajnesh-kanwal commented 1 month ago

Here are the repositories with Control Transfer Records support. Note that they also contain patches to enable support for smcdeleg, smcntrpmf, ssccfg, sscofpmf, smcsrind and sscsrind. Make sure to enable these when running QEMU.

https://github.com/rajnesh-kanwal/qemu/tree/ctr_upstream
https://github.com/rajnesh-kanwal/linux/tree/ctr_upstream
https://github.com/rajnesh-kanwal/opensbi/tree/ctr_upstream

You will also need to compile perf and copy it to your rootfs. I have been using https://github.com/carlosedp/riscv-bringup/blob/master/Ubuntu-Rootfs-Guide.md to create a rootfs. While compiling perf you may find that many packages are missing; make sure to install those dependencies in the rootfs. You can do that by first chrooting into it and then using apt install. I use the following command to compile perf:

PKG_CONFIG_LIBDIR=/<path to mounted rootfs>/usr/lib/riscv64-linux-gnu/pkgconfig/ VF=1 make   EXTRA_CFLAGS="--sysroot=/<path to mounted rootfs>/"   ARCH=riscv  CROSS_COMPILE=riscv64-linux-gnu- NO_LIBBPF=1

QEMU run command:

./build/qemu-system-riscv64 \
  -M virt,aia=aplic-imsic,aia-guests=5 \
  -cpu rv64,smaia=true,ssaia=true,smcdeleg=true,ssccfg=true,smcntrpmf=true,sscofpmf=true,sscsrind=true,smcsrind=true,smctr=true,ssctr=true \
  -icount auto -m 8192 -nographic \
  -kernel /path-to-kernel-build/arch/riscv/boot/Image \
  -append "root=/dev/vda rw console=ttyS0 earlycon=sbi" \
  -drive file=/path-to-rootfs/rootfs.ext4,format=raw,id=hd0 \
  -device virtio-blk-pci,drive=hd0 \
  -netdev user,id=usernet,hostfwd=tcp:127.0.0.1:7722-0.0.0.0:22 \
  -device e1000e,netdev=usernet

Sample code for basic testing:

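#include <sys/syscall.h>        /* SYS_exit */

#define N 1000000

/* Two empty call targets so that the branch in f1() alternates callees. */
void f2(void)
{
}

void f3(void)
{
}

/* Calls f2() for odd n and f3() for even n. */
void f1(unsigned long n)
{
        if (n & 1UL)
                f2();
        else
                f3();
}

/* Entry point: built with -nostdlib, so there is no libc and no main(). */
void _start(void)
{
        unsigned long i;

        for (i = 0; i < N; i++)
                f1(i);

        /* exit(0) via a raw ecall: a7 holds the syscall number, a0 the status. */
        register long a7 asm("a7") = SYS_exit;
        register long a0 asm("a0") = 0;
        asm volatile ("ecall" : : "r"(a7), "r"(a0) : "memory");
        __builtin_unreachable();
}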

Command to compile the above code:

riscv64-unknown-linux-gnu-gcc -nostdlib test.c

Perf run commands:

./perf record -e instructions:ppu -c 10000  -b ./a.out
./perf report  --stdio
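
For anyone driving this from their own tooling instead of the perf binary, the options on the perf record line above map roughly onto a perf_event_attr like the one below. This is a minimal sketch, not taken from the perf source: the helper names ctr_demo_attr and sys_perf_event_open are made up for illustration, and the exact bits perf sets for -b and :ppu may differ between versions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: fill *attr with settings that approximate
 * "perf record -e instructions:ppu -c 10000 -b".
 */
void ctr_demo_attr(struct perf_event_attr *attr)
{
        assert(attr != NULL);
        memset(attr, 0, sizeof(*attr));
        attr->size = sizeof(*attr);
        attr->type = PERF_TYPE_HARDWARE;
        attr->config = PERF_COUNT_HW_INSTRUCTIONS;      /* -e instructions */
        attr->sample_period = 10000;                    /* -c 10000 */
        attr->precise_ip = 2;                           /* :pp modifier */
        attr->exclude_kernel = 1;                       /* :u modifier */
        attr->exclude_hv = 1;
        /* -b: ask for a branch stack with every sample (assumed mapping). */
        attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK;
        attr->branch_sample_type = PERF_SAMPLE_BRANCH_ANY |
                                   PERF_SAMPLE_BRANCH_USER;
        attr->disabled = 1;                             /* enable later via ioctl */
}

/* glibc provides no wrapper for perf_event_open, so go through syscall(2). */
long sys_perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                         int group_fd, unsigned long flags)
{
        return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}
```

A caller would fill the attr, pass it to sys_perf_event_open(&attr, 0, -1, -1, 0) to monitor the current task on any CPU, and then read struct perf_branch_entry records out of each sample in the mmap'ed ring buffer. On a kernel without a branch-stack backend such as CTR, the open is expected to fail.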

gaoyichuan commented 1 month ago

Thanks a lot for your help! I'll check out all the patches.

rajnesh-kanwal commented 1 month ago

Here is a quick wiki page for running the basic CTR demo: https://github.com/rajnesh-kanwal/linux/wiki/Running-CTR-basic-demo-on-QEMU-RISC%E2%80%90V-Virt-machine