Open chanzhennan opened 1 year ago
Thank you for your feedback. I have so far only tried gfx90a and gfx1030 targets, as these are the ones I have available.
The error is somewhere in the performance counter collection, which is of course highly device-specific. This data is not necessary for the benchmark; it just provides some further insight. A quick fix is to remove all lines that contain "measureXXXBytesStart/Stop".
I will try to add a flag to the code that skips the metric measurement, since this functionality has only really been tested on a few devices.
By the way, I would be very much interested in the results that you get.
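For reference, the quick fix mentioned above could be scripted roughly like this. It is only a sketch: the sample file and the calls in it (measureDRAMBytesStart, runKernel) are hypothetical stand-ins for the real main.hip, and the pattern assumes the calls sit on lines of their own. If a result of a Stop call is used later in the file, those uses need to be removed by hand as well.

```shell
# Work in a scratch directory so nothing in the repository is touched.
tmpdir=$(mktemp -d)

# Hypothetical stand-in for a benchmark source file; the real main.hip differs.
cat > "$tmpdir/sample.hip" <<'EOF'
int main() {
  measureDRAMBytesStart();
  runKernel();
  measureDRAMBytesStop();
  return 0;
}
EOF

# Pattern for the "measureXXXBytesStart/Stop" calls.
pattern='measure[A-Za-z0-9_]*Bytes(Start|Stop)'

# First list the matching lines, so they can be reviewed before removal.
grep -nE "$pattern" "$tmpdir/sample.hip"

# Then write a copy of the file without those lines.
grep -vE "$pattern" "$tmpdir/sample.hip" > "$tmpdir/sample_nometrics.hip"
```

Writing to a copy rather than editing in place keeps the original around for comparison.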
I tested this on a machine with an RX 6900 XT. When I use your build command line, it fails for me with the same error. If I use the one from the Makefile, it works. Note the difference:
Yours:
hipcc -std=c++20 -I/opt/rocm/include/rocprofiler/ -I/opt/rocm/hsa/include/hsa -L/opt/rocm/rocprofiler/lib -lrocprofiler64 -lrocprofiler64v2 -lhsa-runtime64 -lrocm_smi64 -ldl main.hip -o demo
Makefile:
hipcc -std=c++20 -I/opt/rocm/include/rocprofiler/ -I/opt/rocm/hsa/include/hsa -L/opt/rocm/rocprofiler/lib -lrocprofiler64 -lhsa-runtime64 -ldl -o hip-l2-cache main.hip
There is an additional "-lrocprofiler64v2" in your command line. Removing it made it work for me. It might still be, though, that some of the metric names are different for gfx1100 and that it still won't work.
Can you please verify that this actually fixes your problem? Also, like I have said, I would be interested in your results.
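The two link lines can also be compared mechanically. The sketch below copies only the -l flags from the two commands quoted above and prints those that appear in the first but not in the second; the variable names are made up for illustration.

```shell
# Linker flags copied from the two command lines quoted above.
yours="-lrocprofiler64 -lrocprofiler64v2 -lhsa-runtime64 -lrocm_smi64 -ldl"
makefile="-lrocprofiler64 -lhsa-runtime64 -ldl"

# Collect every flag from "yours" that does not occur in "makefile".
extra=""
for flag in $yours; do
  case " $makefile " in
    *" $flag "*) ;;                  # flag is also in the Makefile command
    *) extra="$extra $flag" ;;       # flag only appears in "yours"
  esac
done
extra="${extra# }"                   # drop the leading space

printf '%s\n' "$extra"
```

Besides "-lrocprofiler64v2", this also lists "-lrocm_smi64", which the Makefile command does not link either. Only the removal of "-lrocprofiler64v2" was tested above.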
The gpu-cache, gpu-metrics, gpu-stream, and gpu-strides tests pass. The gpu-l2-cache and gpu-latency tests fail:
/opt/rocm/bin/hipcc -std=c++20 -I/opt/rocm/include/rocprofiler/ -I/opt/rocm/hsa/include/hsa -L/opt/rocm/rocprofiler/lib -lrocprofiler64 -lrocprofiler64v2 -lhsa-runtime64 -lrocm_smi64 -ldl main.hip -o demo
./demo
gpu_count 1
Agent 0
data set exec time spread Eff. bw
gpu_count 1
Agent 0
measureMetricStop: no kernel kaunch was intercepted
make: *** [Makefile:25: test] Segmentation fault (core dumped)