Closed Luke20000429 closed 1 year ago
This looks like a bug in the correlation IDs provided by the roctracer library. My guess is that roctracer is giving OmniTrace the same correlation ID every time, so Perfetto is connecting all of them.
Can you click on one of the entries with arrows? In the debug annotations, in the lower right of the details pane at the bottom, there should be a "corr_id" arg with a value. If you right-click it, there should be an option for something like "Find all arg with same value". My guess is that will return a lot of results.
If possible, could you verify this behavior doesn't exist in ROCm 5.3? If you can't, and the code producing this is relatively simple, please just fork and drop it into a subdirectory in the examples folder and LMK so I can try to reproduce it.
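The check suggested above amounts to grouping trace entries by their corr_id and looking for IDs shared by more entries than expected. A minimal offline sketch of that idea follows. It assumes the entries have already been exported as simple (name, corr_id) pairs; the list below is invented for illustration and is not the Perfetto trace format, and `group_by_corr_id`, `suspicious_ids`, and `my_kernel` are hypothetical names.

```python
from collections import defaultdict

# Hypothetical, simplified trace entries: (name, corr_id).
# These stand in for the "corr_id" debug annotation shown in the UI;
# the values are made up for illustration.
entries = [
    ("hipMemcpyAsync", 295),
    ("CopyDeviceToHost", 295),
    ("hipMemcpyAsync", 296),
    ("CopyDeviceToHost", 296),
    ("hipLaunchKernel", 297),
    ("my_kernel", 297),
]


def group_by_corr_id(entries):
    """Map each corr_id to the names of the entries that carry it."""
    groups = defaultdict(list)
    for name, corr_id in entries:
        groups[corr_id].append(name)
    return dict(groups)


def suspicious_ids(groups, max_expected=2):
    """Return corr_ids shared by more entries than expected.

    Assumption: a healthy ID is shared by one API call and one
    device-side activity, i.e. at most two entries.
    """
    return sorted(cid for cid, names in groups.items() if len(names) > max_expected)


groups = group_by_corr_id(entries)
print(groups[295])
print(suspicious_ids(groups))
```

If every ID maps to at most two entries, as in this made-up data, the correlation IDs themselves are probably fine and the extra arrows come from somewhere else.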
Thanks for your quick response. I checked the corr_id of several entries, but it seems that each corr_id is only shared by two entries. For example, one of the CopyDeviceToHost entries has corr_id=295. The same id is only used by the API call.
I didn't install ROCm 5.3, so I might switch to another workstation. If that doesn't work, I will add a simple demo to my forked repo.
Ah no need, I located the problem. I was accidentally using the internal correlation id for critical tracing (which intentionally makes the connections you are seeing) instead of using the roctracer correlation id. I didn't realize that was the case bc I did a very poor job naming the variables: the former variable name was _cid and the latter variable name was _corr_id. I'll get that fixed and generate a v1.9.1 release.
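To illustrate why picking the wrong ID produces the extra arrows, here is a toy model, not OmniTrace's actual code. It assumes that flow arrows are drawn between all entries sharing the chosen ID, that the internal ID (_cid) happens to be the same across the memcpys, and that the roctracer ID (_corr_id) is unique per API call and activity pair. The record values and the `flows` helper are invented for illustration.

```python
from collections import defaultdict

# Toy records with both IDs. Assumption: _cid is shared across the
# operations, while _corr_id pairs one API call with one device activity.
records = [
    {"name": "hipMemcpyAsync (stream 0)", "_cid": 1, "_corr_id": 295},
    {"name": "CopyDeviceToHost (stream 0)", "_cid": 1, "_corr_id": 295},
    {"name": "hipMemcpyAsync (stream 1)", "_cid": 1, "_corr_id": 296},
    {"name": "CopyDeviceToHost (stream 1)", "_cid": 1, "_corr_id": 296},
]


def flows(records, key):
    """Group record names into flows: entries sharing `key` get linked."""
    linked = defaultdict(list)
    for rec in records:
        linked[rec[key]].append(rec["name"])
    return list(linked.values())


wrong = flows(records, "_cid")       # one flow linking every memcpy
right = flows(records, "_corr_id")   # one flow per API call / activity pair

print(len(wrong), [len(f) for f in wrong])
print(len(right), [len(f) for f in right])
```

In this model, keying on _cid chains all four entries together even though the streams are independent, while keying on _corr_id yields two separate two-entry flows, which matches the observation that each corr_id was shared by only two entries.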
Sounds great!
I wrote a program that launches kernels on 4 independent streams. However, when I profile it with omnitrace, the flow events of memcpyD2H look like this. I thought these memcpys should be independent from each other; could someone explain why this happens? I am running on ROCm 5.4.3 with omnitrace 1.8.0 and gfx1030.