ROCm / rocprofiler-compute

Advanced Profiling and Analytics for AMD Hardware
https://rocm.docs.amd.com/projects/omniperf/en/latest/
MIT License

Consistent rocprof run order #304

Closed: coleramos425 closed this issue 7 months ago

coleramos425 commented 7 months ago

Is your feature request related to a problem? Please describe.

For consistency's sake, let's sort the rocprof input files before looping over them for execution.

glob.glob returns its matches in arbitrary, filesystem-dependent order, so the order isn't guaranteed to be the same every time: https://stackoverflow.com/questions/74451703/glob-glob-returns-same-order-in-each-iteration
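A minimal sketch of what sorting the inputs could look like, not the actual omniperf change. The helper names `natural_key` and `sorted_perfmon_inputs` are hypothetical, and the `pmc_perf_*.txt` pattern is assumed from the file names in the logs in this issue. It uses a numeric-aware sort key because a plain `sorted()` would place `pmc_perf_10.txt` before `pmc_perf_2.txt`.

```python
import glob
import os
import re
import tempfile


def natural_key(path):
    """Sort key that compares digit runs in the file name as integers.

    Hypothetical helper: 'pmc_perf_10.txt' becomes ['pmc_perf_', 10, '.txt'],
    so it sorts after 'pmc_perf_2.txt' instead of before it.
    """
    name = os.path.basename(path)
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", name)]


def sorted_perfmon_inputs(perfmon_dir):
    """Return the pmc_perf_*.txt files under perfmon_dir in a deterministic order.

    The file pattern is an assumption based on the paths shown in the logs.
    """
    pattern = os.path.join(perfmon_dir, "pmc_perf_*.txt")
    # glob.glob gives no ordering guarantee, so sort explicitly before looping.
    return sorted(glob.glob(pattern), key=natural_key)


if __name__ == "__main__":
    # Demonstrate on throwaway files created out of order.
    with tempfile.TemporaryDirectory() as d:
        for i in (10, 3, 0, 4):
            open(os.path.join(d, f"pmc_perf_{i}.txt"), "w").close()
        for fname in sorted_perfmon_inputs(d):
            print(os.path.basename(fname))
```

If numeric ordering does not matter, wrapping the existing call as `sorted(glob.glob(pattern))` would already make the loop order reproducible.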

Additional context

See the example below: the two runs report different input files (pmc_perf_3.txt in Run 1, pmc_perf_4.txt in Run 2).

Run 1

Profiler choice = rocprofv1
omniperf ver: 2.0.0-Tech-Preview
Path: /home1/karl/repos/omniperf/sample/workloads/atest/MI100
Target: MI100
Command: ./vcopy -n 1048576 -b 256
Kernel Selection: None
Dispatch Selection: None
IP Blocks: All
KernelName verbose: 2
Current input file: /home1/karl/repos/omniperf/sample/workloads/atest/MI100/perfmon/pmc_perf_3.txt
RPL: on '240306_140053' from '/opt/rocm-6.0.2' in '/home1/karl/repos/omniperf/sample'
RPL: profiling '""./vcopy -n 1048576 -b 256""'
RPL: input file '/home1/karl/repos/omniperf/sample/workloads/atest/MI100/perfmon/pmc_perf_3.txt'
RPL: output dir '/tmp/rpl_data_240306_140053_220340'
RPL: result dir '/tmp/rpl_data_240306_140053_220340/input0_results_240306_140053'
ROCProfiler: input from "/tmp/rpl_data_240306_140053_220340/input0.xml"
  gpu_index = 
  kernel = 
  range = 
  23 metrics

Run 2

Profiler choice = rocprofv1
omniperf ver: 2.0.0-Tech-Preview
Path: /home1/karl/repos/omniperf/sample/workloads/atest/MI100
Target: MI100
Command: ./vcopy -n 1048576 -b 256
Kernel Selection: None
Dispatch Selection: None
IP Blocks: All
KernelName verbose: 2
Current input file: /home1/karl/repos/omniperf/sample/workloads/atest/MI100/perfmon/pmc_perf_4.txt
RPL: on '240306_140118' from '/opt/rocm-6.0.2' in '/home1/karl/repos/omniperf/sample'
RPL: profiling '""./vcopy -n 1048576 -b 256""'
RPL: input file '/home1/karl/repos/omniperf/sample/workloads/atest/MI100/perfmon/pmc_perf_4.txt'
RPL: output dir '/tmp/rpl_data_240306_140118_224797'
RPL: result dir '/tmp/rpl_data_240306_140118_224797/input0_results_240306_140118'
ROCProfiler: input from "/tmp/rpl_data_240306_140118_224797/input0.xml"
  gpu_index = 
  kernel = 
  range = 
  22 metrics

CC: @koomie