ROCm / rocprofiler-compute

Advanced Profiling and Analytics for AMD Hardware
https://rocm.docs.amd.com/projects/omniperf/en/latest/
MIT License

Manual rocprof join breaks in ROCm 5.2.x #139

Closed coleramos425 closed 1 year ago

coleramos425 commented 1 year ago

I noticed that profiling applications under ROCm 5.2.x fails. A peek at the verbose debug logs shows that we crash when checking for arch_vgpr and accum_vgpr (two counters added in ROCm 5.3).

It's not the concern I raised in the original ticket (https://github.com/AMDResearch/omniperf/issues/117#issuecomment-1548683699), but it should be an easy fix.

ROCPRofiler: 167 contexts collected, output directory /tmp/rpl_data_230608_132620_2929822/input_results_230608_132620
File '/home/colramos/GitHub/omniperf-pub/workloads/mix_all/mi200/timestamps.csv' is generating
Successfully joined gpu in pmc_perf.csv
Successfully joined grd in pmc_perf.csv
Successfully joined wgr in pmc_perf.csv
Successfully joined lds in pmc_perf.csv
Successfully joined scr in pmc_perf.csv
Traceback (most recent call last):
  File "./src/omniperf", line 917, in <module>
    main()
  File "./src/omniperf", line 812, in main
    omniperf_profile(args, VER)
  File "./src/omniperf", line 698, in omniperf_profile
    join_prof(workload_dir, args.join_type, log, args.verbose)
  File "/home/colramos/GitHub/omniperf-pub/src/utils/perfagg.py", line 136, in join_prof
    if not test_df_column_equality(_df):
  File "/home/colramos/GitHub/omniperf-pub/src/utils/perfagg.py", line 92, in test_df_column_equality
    return df.eq(df.iloc[:, 0], axis=0).all(1).all()
  File "/home/colramos/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 961, in __getitem__
    return self._getitem_tuple(key)
  File "/home/colramos/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1458, in _getitem_tuple
    tup = self._validate_tuple_indexer(tup)
  File "/home/colramos/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 769, in _validate_tuple_indexer
    self._validate_key(k, i)
  File "/home/colramos/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1361, in _validate_key
    self._validate_integer(key, axis)
  File "/home/colramos/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1452, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds

https://github.com/AMDResearch/omniperf/blob/a346db7646b0a935f4cac51d131b4a585f065c05/src/utils/perfagg.py#L123-L133
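
For anyone hitting the same traceback, here is a minimal sketch of the failure mode, not the actual patch. It assumes the frame handed to the equality check ends up with zero columns when the arch_vgpr / accum_vgpr counters are absent from the ROCm 5.2.x output. The column names, the `joined` frame, and both helper names are hypothetical; only the `df.eq(df.iloc[:, 0], axis=0)` expression comes from the traceback above.

```python
import pandas as pd


def column_equality_unguarded(df):
    # Same expression as the line shown in the traceback: compare every
    # column against the first one. Fails if df has no columns at all.
    return df.eq(df.iloc[:, 0], axis=0).all(axis=1).all()


def columns_equal_or_absent(df):
    # Hypothetical guarded variant: a frame with no matching columns is
    # treated as trivially consistent instead of indexing column 0.
    if df.shape[1] == 0:
        return True
    return bool(df.eq(df.iloc[:, 0], axis=0).all(axis=1).all())


# Stand-in for joined output that lacks the counters added in ROCm 5.3.
# These column names are illustrative only.
joined = pd.DataFrame({"grd_a": [64, 64], "grd_b": [64, 64]})

# Selecting a counter that was never collected yields a zero-column frame.
missing = joined.filter(regex=r"^arch_vgpr")

try:
    column_equality_unguarded(missing)
except IndexError as err:
    print(f"unguarded check failed: {err}")

print(columns_equal_or_absent(missing))
print(columns_equal_or_absent(joined.filter(regex=r"^grd")))
```

Whether a missing counter should count as "equal" or be skipped before the check is a design choice. The sketch only shows that the empty selection has to be handled before `iloc[:, 0]` is reached.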

coleramos425 commented 1 year ago

Fixed.