Closed tiandi111 closed 2 months ago
@tiandi111 Apologies for the lack of response. Do you still need assistance with this ticket? If not, please close the ticket. Thanks!
@tiandi111 Closing ticket for now. Please leave a comment if you still need assistance and I will re-open the ticket. Thanks!
I've used rocprofv2 and encountered the same problem described in this issue.
What is the recommended way to profile multi-GPU code with ROCm-5.6? An API is preferred, since I want to control the profiling range.
Omnitrace seems like overkill for me, since I only want to trace the kernels on each GPU.
For example, the output of rocprofv2 in "ROCPROFILER_DISPATCH_TIMESTAMPS_COLLECTION" mode is exactly what I want:
Thanks!