ROCm / rocprofiler-compute

Advanced Profiling and Analytics for AMD Hardware
https://rocm.docs.amd.com/projects/omniperf/en/latest/
MIT License

Add embedded performance profiling capability. #178

Open coleramos425 opened 1 year ago

coleramos425 commented 1 year ago

Describe the suggestion: Add embedded performance profiling capability.

Justification: An optional capability to report the wall-clock execution time spent in major functions within OmniPerf. Useful for ongoing development optimization and for detecting performance regressions.

Implementation: Include a timer class that can be used to demarcate start/stop points for regions of interest and to aggregate wall-clock execution time per region.
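
As a rough illustration of the kind of timer class described here, the sketch below aggregates wall-clock time per named region. It is not taken from the OmniPerf code base: the class name `RegionTimer`, its methods, and the `enabled` and `clock` parameters are all hypothetical, and `time.perf_counter` is assumed to be an acceptable wall-clock source.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class RegionTimer:
    """Hypothetical timer that aggregates wall-clock time per named region."""

    def __init__(self, enabled=True, clock=time.perf_counter):
        self.enabled = enabled            # when False, start/stop do nothing
        self._clock = clock               # injectable clock (assumption, eases testing)
        self._starts = {}                 # region name -> start timestamp
        self.totals = defaultdict(float)  # region name -> accumulated seconds
        self.counts = defaultdict(int)    # region name -> completed intervals

    def start(self, name):
        """Mark the beginning of a region of interest."""
        if self.enabled:
            self._starts[name] = self._clock()

    def stop(self, name):
        """Mark the end of a region and add the elapsed time to its total."""
        if not self.enabled:
            return
        begin = self._starts.pop(name, None)
        if begin is None:
            raise RuntimeError(f"stop() called for region {name!r} without start()")
        self.totals[name] += self._clock() - begin
        self.counts[name] += 1

    @contextmanager
    def region(self, name):
        """Context-manager form of start/stop; stops even if the body raises."""
        self.start(name)
        try:
            yield
        finally:
            self.stop(name)

    def report(self):
        """Return one line per region, slowest region first."""
        rows = sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)
        return "\n".join(
            f"{name}: {total:.6f} s over {self.counts[name]} call(s)"
            for name, total in rows
        )
```

A region could then be wrapped as `with timer.region("parse_inputs"): ...` (the region name is illustrative), with `timer.report()` printed at exit. Keeping the disabled path to a single boolean check is one way to keep the feature optional without adding measurable overhead.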

Additional Notes: Integrate with the logger option mentioned above, or add a separate command-line argument to enable it.
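
One possible shape for the opt-in switch, sketched under assumptions: the flag name `--self-profile`, the logger name `selfprofile`, and the `timed` decorator are hypothetical and do not correspond to any existing OmniPerf option. The sketch uses only the standard `argparse` and `logging` modules and reports timings through the logger rather than printing them directly.

```python
import argparse
import functools
import logging
import time

# Hypothetical logger name; a real integration would reuse the tool's own logger.
logger = logging.getLogger("selfprofile")


def build_parser():
    """Build a parser with a hypothetical opt-in flag for self-profiling."""
    parser = argparse.ArgumentParser(prog="tool")
    parser.add_argument(
        "--self-profile",
        action="store_true",
        help="log wall-clock time spent in major functions (illustrative flag)",
    )
    return parser


def timed(enabled):
    """Return a decorator that logs elapsed wall-clock time when enabled."""

    def decorate(func):
        if not enabled:
            return func  # disabled: leave the function untouched, zero overhead

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            begin = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed = time.perf_counter() - begin
                logger.info("%s took %.6f s", func.__qualname__, elapsed)

        return wrapper

    return decorate
```

With this arrangement, `args = build_parser().parse_args()` followed by `timed(args.self_profile)(some_function)` wraps a function only when the flag is given, so timing output appears at the logger's INFO level and is otherwise absent.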

Originally posted by @koomie in https://github.com/AMDResearch/omniperf/discussions/153#discussioncomment-6630057