ROCm / omnitrace

Omnitrace: Application Profiling, Tracing, and Analysis
https://rocm.docs.amd.com/projects/omnitrace/en/latest/
MIT License

Missing GPU kernels when using @profile and -b flag #312

Open dwchang79 opened 1 year ago

dwchang79 commented 1 year ago

I am using the @profile decorator and the -b flag to exclude the initial training section of an ML workload so that only the inference part is profiled. That works, but the GPU kernels and related information are now missing. The call stack shows the functions, but they are not linked to any GPU activity: no GPU devices are shown, and nothing is shown running on them.
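For readers following along, here is a minimal sketch of the kind of script this setup implies. It assumes that launching with `python3 -m omnitrace -b -- <script>` places a `profile` decorator in Python's builtins; the function names `train` and `infer` are hypothetical stand-ins, and the no-op fallback is added only so the sketch also runs without omnitrace.

```python
import builtins

# Assumption: `python3 -m omnitrace -b` injects `profile` into builtins.
# Fall back to a no-op decorator so the script still runs without omnitrace.
try:
    profile = builtins.profile
except AttributeError:
    def profile(func):
        return func


def train(steps):
    """Hypothetical training phase; left undecorated so it is not profiled."""
    weight = 0.0
    for _ in range(steps):
        weight += 0.5
    return weight


@profile
def infer(weight, inputs):
    """Hypothetical inference phase; the only function marked for profiling."""
    return [weight * x for x in inputs]


if __name__ == "__main__":
    w = train(4)
    print(infer(w, [1, 2, 3]))
```

Only the decorated function is selected for profiling here; the GPU work it launches is what the report above says goes missing from the trace.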

I have attached two screenshots: one of the entire run (without the profile flags), where the GPU section appears at the bottom as "HIP Activity Device 2, Queue 0", and one where only the inference part is profiled and the GPU information is gone.

Thank you.

[Screenshots: "Complete", "Inference"]

jrmadsen commented 10 months ago

Try prefixing the command with omnitrace-run: omnitrace-run -- python3 -m omnitrace -b -- <script> <script-args>. I suspect that the deferred initialization caused by @profile results in omnitrace being initialized after the HIP runtime, so omnitrace never gets registered as a profiling tool for the HIP runtime.
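If the run is started from a Python driver rather than typed by hand, the suggested command line can be assembled as in the sketch below. The helper name `build_launch_command` and the example script name and arguments are hypothetical; only the `omnitrace-run -- python3 -m omnitrace -b --` prefix comes from the suggestion above.

```python
import shlex


def build_launch_command(script, script_args=()):
    """Return the argv list for the omnitrace-run-prefixed launch.

    The fixed prefix mirrors the command suggested in this thread; `script`
    and `script_args` are whatever the user would normally run.
    """
    return [
        "omnitrace-run", "--",
        "python3", "-m", "omnitrace", "-b", "--",
        script, *script_args,
    ]


if __name__ == "__main__":
    # Hypothetical script name and arguments, for illustration only.
    cmd = build_launch_command("inference.py", ["--batch-size", "8"])
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` unchanged; building it as a list avoids shell-quoting problems in the script arguments.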

ppanchad-amd commented 1 week ago

Hi @dwchang79. Has your issue been resolved? If so, please close the ticket. Thanks!