ROCm / roctracer

ROCm Tracer Callback/Activity Library for Performance tracing AMD GPUs
https://rocm.docs.amd.com/projects/roctracer/en/latest/

The rocm-3.9.x version of roctracer reports all kinds of memcpy as CopyDeviceToDevice when using pinned memory #45

Closed ipe-zhangyz closed 3 years ago

ipe-zhangyz commented 3 years ago

Hello, I'm using ROCm 3.9.1. I find that if the host memory is pinned, i.e. allocated with hipHostMalloc, then when I trace the code with rocprof --hip-trace, all hipMemcpyHostToDevice and hipMemcpyDeviceToHost operations are reported as CopyDeviceToDevice. If the host memory is unpinned, the H2D and D2H memcpy operations are reported correctly.
In addition, it seems that memory copy operations are still handled in the HCC-related code in the rocm-3.9.x version of roctracer, which confuses me, as there should be no HCC runtime in ROCm 3.9. I also see many operations named "Unknown command type" when using the rocm-3.9.x version of roctracer; they do not appear when tracing the same code with the rocm-3.3.x version. I don't know whether this is caused by the use of streams and events, but no such unknown commands appear when tracing the very simple MatrixTranspose sample, which uses neither streams nor events.
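
The pinned-memory case described above could be exercised with a reproducer along the following lines. This is only a sketch and not the original poster's code: the file name `repro.cpp`, the `CHECK` macro, and the 1 MiB buffer size are assumptions made for illustration. It issues the same H2D and D2H copies once with a hipHostMalloc buffer and once with a plain malloc buffer, so the two sets of copy records can be compared in a single trace.

```cpp
// Hypothetical reproducer (illustrative, not taken from the original report).
// Assumed build step:  hipcc repro.cpp -o repro
// Trace as in the report:  rocprof --hip-trace ./repro
#include <hip/hip_runtime.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Illustrative helper: abort with a message if a HIP call fails.
#define CHECK(call)                                              \
    do {                                                         \
        hipError_t err_ = (call);                                \
        if (err_ != hipSuccess) {                                \
            std::fprintf(stderr, "%s failed: %s\n", #call,       \
                         hipGetErrorString(err_));               \
            std::exit(EXIT_FAILURE);                             \
        }                                                        \
    } while (0)

int main() {
    const size_t bytes = 1 << 20;  // 1 MiB, arbitrary size

    void* pinned = nullptr;               // pinned host buffer
    void* pageable = std::malloc(bytes);  // unpinned host buffer
    void* device = nullptr;               // device buffer
    if (pageable == nullptr) return EXIT_FAILURE;

    CHECK(hipHostMalloc(&pinned, bytes, hipHostMallocDefault));
    CHECK(hipMalloc(&device, bytes));
    std::memset(pinned, 1, bytes);
    std::memset(pageable, 2, bytes);

    // Pinned host memory: the copies reported as CopyDeviceToDevice in the issue.
    CHECK(hipMemcpy(device, pinned, bytes, hipMemcpyHostToDevice));
    CHECK(hipMemcpy(pinned, device, bytes, hipMemcpyDeviceToHost));

    // Unpinned host memory: the copies reported correctly in the issue.
    CHECK(hipMemcpy(device, pageable, bytes, hipMemcpyHostToDevice));
    CHECK(hipMemcpy(pageable, device, bytes, hipMemcpyDeviceToHost));

    CHECK(hipFree(device));
    CHECK(hipHostFree(pinned));
    std::free(pageable);
    return 0;
}
```

The sketch deliberately uses no streams or events, so it only isolates the pinned versus unpinned difference; it is not expected to say anything about the "Unknown command type" entries.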