code-fool closed this issue 3 months ago
Hi @code-fool, I was not able to reproduce the issue you are seeing with the latest release of rocprof in ROCm 6.2. I used the following input file (input.txt) in my test:
# Perf counters group 1
pmc: MemUnitStalled,TCC_MISS[0]
# Filter by dispatches range, GPU index and kernel names
# supported range formats: "3:9", "3:", "3"
range: 0:1
gpu: 0
kernel: matrixTranspose
I then compiled the MatrixTranspose sample and ran it under the profiler with rocprof -i input.txt ./MatrixTranspose. After completion, input.csv contained the following:
Index,KernelName,gpu-id,queue-id,queue-index,pid,tid,grd,wgr,lds,scr,arch_vgpr,accum_vgpr,sgpr,wave_size,sig,obj,MemUnitStalled,TCC_MISS[0]
0,"matrixTranspose(float*, float*, int) [clone .kd]",2,0,1,1231427,1231427,1048576,16,0,0,8,8,16,64,0x0,0x705afa684880,0.0000000000,0
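If it helps to compare results, here is a minimal sketch for reading the counter columns out of that CSV with Python's standard csv module. The header and row are copied from the output above; the helper name read_counters is only illustrative and is not part of rocprof. In practice you would open the generated input.csv instead of the embedded sample string.

```python
import csv
import io

# Header and row copied from the rocprof output shown above.
# In practice, replace io.StringIO(SAMPLE) with open("input.csv", newline="").
SAMPLE = '''Index,KernelName,gpu-id,queue-id,queue-index,pid,tid,grd,wgr,lds,scr,arch_vgpr,accum_vgpr,sgpr,wave_size,sig,obj,MemUnitStalled,TCC_MISS[0]
0,"matrixTranspose(float*, float*, int) [clone .kd]",2,0,1,1231427,1231427,1048576,16,0,0,8,8,16,64,0x0,0x705afa684880,0.0000000000,0
'''


def read_counters(stream, counters):
    """Return one dict per dispatch holding the kernel name and the requested counters.

    Illustrative helper (not a rocprof API). A KeyError here means the
    requested counter column is missing from the CSV.
    """
    rows = []
    for row in csv.DictReader(stream):
        entry = {"KernelName": row["KernelName"]}
        for name in counters:
            entry[name] = float(row[name])
        rows.append(entry)
    return rows


results = read_counters(io.StringIO(SAMPLE), ["MemUnitStalled", "TCC_MISS[0]"])
for r in results:
    print(r)
```

The csv module handles the quoted kernel name, which itself contains commas, so splitting each line on "," by hand would misalign the columns.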
I will close out this ticket. If you are still encountering this issue after following the steps above, please comment with your findings and I will re-open this ticket. Thanks!
I cannot get the result that this picture shows.