Closed JoseSantosAMD closed 3 weeks ago
ah, the dataframes are entirely empty by the time we read them here: https://github.com/AMDResearch/omniperf/blob/4a8917f8802dabe995294390730796fcddfe3017/src/omniperf_profile/profiler_base.py#L113
probably just need a check for that and a helpful error message, as Jose suggests
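A minimal sketch of what such a check could look like, assuming the counter data is held in pandas DataFrames as the traceback in this thread indicates. The helper `require_counter_columns` and the exception `EmptyProfileError` are hypothetical names, not part of Omniperf; the two column names are the groupby keys that appear in `join_prof`.

```python
import pandas as pd

# Groupby keys used by join_prof, taken from the traceback in this thread.
REQUIRED_COLUMNS = ("Kernel_Name", "Grid_Size")


class EmptyProfileError(RuntimeError):
    """Hypothetical exception: the profiler output has nothing to join."""


def require_counter_columns(df: pd.DataFrame, source: str) -> pd.DataFrame:
    """Return df unchanged, or raise a readable error instead of a KeyError."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise EmptyProfileError(
            f"{source}: missing column(s) {missing}; the profiler wrote no usable data"
        )
    if df.empty:
        raise EmptyProfileError(
            f"{source}: no dispatches were recorded; "
            "check the --kernel/--dispatch filters"
        )
    return df


if __name__ == "__main__":
    try:
        # A frame without Grid_Size stands in for a bad counter file.
        require_counter_columns(pd.DataFrame({"Kernel_Name": []}), "timestamps.csv")
    except EmptyProfileError as err:
        print(f"ERROR: {err}")
```

Calling something like this right after each CSV is read, before the `groupby`, would turn the raw `KeyError` into a message that names the file and hints at the filters.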
I get this error without --dispatch or --kernel specified:
$ omniperf profile -n wcsph -- ~/.juliaup/bin/julia --project=run ./benchmarks/gpu.jl
___ _ __
/ _ \ _ __ ___ _ __ (_)_ __ ___ _ __ / _|
| | | | '_ ` _ \| '_ \| | '_ \ / _ \ '__| |_
| |_| | | | | | | | | | | |_) | __/ | | _|
\___/|_| |_| |_|_| |_|_| .__/ \___|_| |_|
|_|
INFO Omniperf version: 2.0.1
INFO Profiler choice: rocprofv1
INFO Path: /home/efaulha2/git/PointNeighbors.jl/workloads/wcsph/MI200
INFO Target: MI200
INFO Command: /home/efaulha2/.juliaup/bin/julia --project=run ./benchmarks/gpu.jl
INFO Kernel Selection: None
INFO Dispatch Selection: None
INFO Hardware Blocks: All
INFO
INFO ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
INFO Collecting Performance Counters
INFO ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[...]
INFO |-> [rocprof] File '/home/efaulha2/git/PointNeighbors.jl/workloads/wcsph/MI200/timestamps.csv' is generating
INFO |-> [rocprof]
Traceback (most recent call last):
  File "/home/efaulha2/omniperf/2.0.1/bin/omniperf", line 138, in <module>
    main()
  File "/home/efaulha2/omniperf/2.0.1/bin/omniperf", line 126, in main
    omniperf.run_profiler()
  File "/home/efaulha2/omniperf/2.0.1/libexec/omniperf/utils/utils.py", line 45, in wrap_function
    result = function(*args, **kwargs)
  File "/home/efaulha2/omniperf/2.0.1/libexec/omniperf/omniperf_base.py", line 229, in run_profiler
    profiler.post_processing()
  File "/home/efaulha2/omniperf/2.0.1/libexec/omniperf/utils/utils.py", line 45, in wrap_function
    result = function(*args, **kwargs)
  File "/home/efaulha2/omniperf/2.0.1/libexec/omniperf/omniperf_profile/profiler_rocprof_v1.py", line 85, in post_processing
    self.join_prof()
  File "/home/efaulha2/omniperf/2.0.1/libexec/omniperf/utils/utils.py", line 45, in wrap_function
    result = function(*args, **kwargs)
  File "/home/efaulha2/omniperf/2.0.1/libexec/omniperf/omniperf_profile/profiler_base.py", line 124, in join_prof
    key = _df.groupby(["Kernel_Name", "Grid_Size"]).cumcount()
  File "/home/efaulha2/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 9183, in groupby
    return DataFrameGroupBy(
  File "/home/efaulha2/.local/lib/python3.9/site-packages/pandas/core/groupby/groupby.py", line 1329, in __init__
    grouper, exclusions, obj = get_grouper(
  File "/home/efaulha2/.local/lib/python3.9/site-packages/pandas/core/groupby/grouper.py", line 1043, in get_grouper
    raise KeyError(gpr)
KeyError: 'Grid_Size'
@JoseSantosAMD Issue is fixed. Closing ticket. Thanks!
Describe the bug: While trying to break Omniperf, I found some commands that produce a Grid_Size error when filtering for an out-of-range or invalid device/kernel.
Development Environment:
Expected behavior: Graceful exit with helpful output