This PR adds two shell scripts for performance analysis based on the built-in trace events in XRT. After the amdxdna plug-in package is installed, the scripts can be found under /opt/xilinx/xrt/amdxdna. They rely on the Linux 'perf' command, so it must be available in your PATH.
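Because both scripts depend on perf, a quick pre-flight check can catch a missing tool before a run. This is a minimal sketch; the `have_cmd` helper is a hypothetical name used for illustration and is not part of the PR:

```shell
# Hypothetical helper (not part of this PR): succeeds if the command is found in PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd perf; then
    echo "perf found: $(command -v perf)"
else
    echo "perf not found in PATH; install it before running the scripts" >&2
fi
```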
npu_perf_trace.sh: runs an XRT application with all necessary trace events enabled, collects perf data, and converts it to a log file for further analysis.
npu_perf_analyze.sh: analyzes the output of npu_perf_trace.sh. It takes two user-specified events, parses the output log, and calculates the average time difference between them. The user can also restrict the analysis to a range of the log to better understand the performance trend.
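To illustrate the kind of calculation described above, here is a rough sketch of averaging the time between two events in a text log. It is not the implementation in npu_perf_analyze.sh. It assumes a log format similar to typical `perf script` output, where each line contains a `seconds.fraction:` timestamp field immediately followed by the event name; the function name `avg_event_delta` and the event names in the usage line are hypothetical.

```shell
# Hypothetical sketch (not the PR's script): average time in microseconds
# between each START event and the next END event in a text log.
# Assumed line format: "<comm> <pid> [<cpu>] <sec.frac>: <event>: ..."
# Usage: avg_event_delta LOG START_EVENT END_EVENT [SKIP_PAIRS]
avg_event_delta() {
    log=$1; start_ev=$2; end_ev=$3; skip=${4:-0}
    awk -v s="$start_ev" -v e="$end_ev" -v skip="$skip" '
    {
        # Find the timestamp field; the event name is the field after it.
        ts = -1
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9]+\.[0-9]+:$/) {
                ts = substr($i, 1, length($i) - 1) + 0
                ev = $(i + 1)
                break
            }
        }
        if (ts < 0) next
        sub(/:$/, "", ev)            # drop the trailing colon after the event name
        if (ev == s) {
            t0 = ts; pending = 1     # remember the most recent start
        } else if (ev == e && pending) {
            pending = 0
            if (++pairs > skip) {    # ignore the first SKIP_PAIRS pairs (warm-up)
                sum += ts - t0; n++
            }
        }
    }
    END {
        if (n == 0) { print "no matching event pairs"; exit 1 }
        printf "%.3f\n", sum / n * 1e6
    }' "$log"
}
```

For example, `avg_event_delta trace.log sdt_xrt:evt_a sdt_xrt:evt_b 100` would skip the first 100 pairs and print the average gap of the rest; the file name and event names here are placeholders, so substitute the names that actually appear in your log.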
Example:
Let's first collect performance data from the xrt-smi validate -r latency test:
# /opt/xilinx/xrt/amdxdna/npu_perf_trace.sh /opt/xilinx/xrt/bin/xrt-smi validate -d -r latency
[INFO]: Found NPU device 0000:c5:00.1 at /sys/kernel/debug/accel
[INFO]: XRT SDT is removed
[INFO]: XRT SDT is added
[INFO]: perf record -e amdxdna_trace:* -e sdt_xrt:* -a /opt/xilinx/xrt/bin/xrt-smi validate -d -r latency
Validate Device : [0000:c5:00.1]
Platform : RyzenAI-npu4
Power Mode : Default
-------------------------------------------------------------------------------
Verbose: Enabling Verbosity
Test 1 [0000:c5:00.1] : latency
Description : Run end-to-end latency test
Xclbin : /opt/xilinx/xrt/amdxdna/bins/17f0_10/validate.xclbin
Details : Kernel name is 'DPU_PDI_0'
Instruction size: '20' bytes
No. of iterations: '10000'
Average latency: '46.4' us
Test Status : [PASSED]
-------------------------------------------------------------------------------
Validation completed
[ perf record: Woken up 65 times to write data ]
[ perf record: Captured and wrote 17.190 MB perf.data (170133 samples) ]
[INFO]: XRT SDT is removed
Now, let's take a look at the average time between xrt::run.start() and xrt::run.wait2() (skipping the first 100 events, since they may be slower while the CPU frequency ramps up)