Open LMiaoH opened 11 months ago
As far as I can tell, this is a problem with spike or pk not exposing the cycle CSR.
I'm not sure what's going on though, as pk enables it here:
If you remove the __asm statement in rv_cycle and remove _zfh_zba_zbb_zbs from config.mk, or add them to the spike ISA string, it runs for me.
Thanks. Unfortunately, trying these suggestions doesn't seem to have any effect, and the same issue persists.
What should be my next move or consideration in this situation?
Here is a Dockerfile that demonstrates what I was referring to:
FROM ubuntu:23.04
RUN apt-get update \
&& apt-get install -y build-essential wget git gcc-riscv64-linux-gnu clang device-tree-compiler lld \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
RUN git clone --depth=1 https://github.com/riscv-software-src/riscv-isa-sim \
&& cd riscv-isa-sim \
&& ./configure \
&& make -j $(nproc) \
&& make install \
&& cd .. \
&& rm -rf riscv-isa-sim
RUN git clone --depth=1 https://github.com/riscv-software-src/riscv-pk \
&& mkdir riscv-pk/build \
&& cd riscv-pk/build \
&& ../configure --host=riscv64-linux-gnu --with-arch=rv64gcv --with-abi=lp64d \
&& make -j $(nproc) \
&& make install \
&& cd ../.. \
&& rm -rf riscv-pk
RUN git clone --recursive https://github.com/camel-cdr/rvv-bench \
&& cd rvv-bench \
&& sed -i 's/_zfh//g' config.mk \
&& cd bench \
&& sed -i 's/.*rdcycle.*//g' bench.h \
&& make
WORKDIR /rvv-bench/bench
RUN spike --isa=rv64gcv /usr/local/riscv64-linux-gnu/bin/pk ./memcpy
Since the Dockerfile removes the rdcycle code, the benchmark won't report any reasonable timing results, but spike wouldn't have produced meaningful timings anyway, as simulators don't reflect what happens on actual hardware. IIRC qemu just passes through your host cycle counter; I'm not sure how spike is supposed to implement it.
I personally use qemu for testing, so I'd recommend you try that if you can. My personal setup uses the clang and qemu-user Debian packages.
What are you using this for specifically? Maybe I can help better that way.
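To illustrate the rdcycle point above, here is a minimal sketch of a counter read that only uses rdcycle when explicitly requested and otherwise falls back to a monotonic clock. The helper name read_counter and the USE_RDCYCLE macro are hypothetical and are not taken from rvv-bench; whether clock_gettime is available in a given environment (for example under pk) is also an assumption.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime with strict -std= flags */
#include <stdint.h>
#include <time.h>

/* Hypothetical helper, not the benchmark's own rv_cycle.
 * Returns a monotonically non-decreasing counter value:
 *  - cycles from the cycle CSR when built for rv64 with -DUSE_RDCYCLE,
 *  - nanoseconds from CLOCK_MONOTONIC otherwise. */
static inline uint64_t read_counter(void)
{
#if defined(__riscv) && (__riscv_xlen == 64) && defined(USE_RDCYCLE)
    uint64_t cycles;
    /* Raises an illegal-instruction exception if the cycle CSR
     * is not exposed to user mode. */
    __asm__ volatile ("rdcycle %0" : "=r"(cycles));
    return cycles;
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}

/* Difference between two readings taken with read_counter(). */
static inline uint64_t counter_elapsed(uint64_t start, uint64_t end)
{
    return end - start;
}
```

Calling read_counter() before and after the code under test and passing both values to counter_elapsed() gives a rough duration. Under a simulator the result only shows that the program runs, not how fast it would be on real hardware.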
Thanks. I successfully implemented your suggestions.
However, I'm wondering how long these test programs might take to run on Spike. The memcpy benchmark has been running for over 24 hours now, and here are some snippets of the output:
bbl loader
{
title: "memcpy",
labels: ["0","musl","scalar","scalar_autovec","rvv_m1","rvv_m2","rvv_m4","rvv_m8","rvv_align_dest_m1","rvv_align_dest_m2","rvv_align_dest_m4","rvv_align_dest_m8","rvv_align_src_m1","rvv_align_src_m2","rvv_align_src_m4","rvv_align_src_m8","rvv_align_dest_hybrid_m1","rvv_align_dest_hybrid_m2","rvv_align_dest_hybrid_m4","rvv_align_dest_hybrid_m8","rvv_tail_m1","rvv_tail_m2","rvv_tail_m4","rvv_tail_m8","rvv_128_m1","rvv_128_m2","rvv_128_m4","rvv_128_m8",],
data: [
[1,4,7,11,15,20,25,31,38,46,55,65,77,91,107,125,145,168,195,225,260,300,345,397,456,524,601,689,790,905,1037,1188,1360,1557,1782,2039,2333,2669,3053,3492,3993,4566,5221,5969,6824,7801,8918,10195,11654,13321,15227,17405,19894,22739,25990,29705,33951,38804,44350,50688,57932,66211,75672,86485,98843,112966,129107,147553,168635,192728,220263,251732,287696,328798,375772,429456,490809,560927,641062,732645,837311,956929,1093636,1249872,1428428,1632492,1865708,2132240,2436848,2784972,3182828,3637520,4157168,4751052,5429776,6205461,7091958,8105097,9262971,10586255,12098580,13826951,15802232,],
[18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,18446744073709551615.9551615,
The default runtime is tuned for getting good measurements on real hardware; you can modify the bench/config.h file. Everything depends on MAX_MEM.
Hello: I am attempting to execute the bench on spike, and after running 'make all,' I encounter the following problem when attempting to execute the generated executable with spike:
The relevant portion in the log file is as follows:
What could be the cause of this issue, and do you have any suggestions for resolving it? By the way, I'm using the following version of the clang compiler: