Closed lfmeadow closed 3 months ago
Could you please clarify what you mean by number of bytes? copy_stats contains cumulative values. Do you need per-call trace information?
Yes, I want per-call information. Similarly I'd like per-call launch information for all kernel launches. They exist for hipModuleLaunchKernel but are blank for hipLaunchKernelGGL. I'm happy to mine this out of the database, I just want the information.
https://github.com/ROCm-Developer-Tools/rocprofiler/pull/86
Just a quick workaround: add the --save-temp option and set the temporary directory with the -t option. The raw data with all telemetry will persist in the temporary directory. I think it will be fixed soon.
I have checked the db content. All kernel launches and memory copies exist. Could you please provide a small reproducer?
Yes, OK, fair enough, give me a few days.
Well, it took more than a few days. Sorry. I do see that data in the sqlite db for copies:
,BeginNs,EndNs,pid,tid,Name,args,Index,Data,__section,__lane,DurationNs
9164,7148818203319432,7148818203870232,19156,19156,hipMemcpyAsync,( dst(0xf89e20) src(0x7f3480e08000) sizeBytes(3840) kind(4) stream(2)),9165,,2,19156,550800
(Sorry about the long lines I wish github would wrap)
This is from pandas after reading the HIP table from the database. I was expecting to see it in the COPY table.
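Until the byte count shows up in the COPY table, it can be recovered from the `args` column of the HIP table shown above. The following is a minimal sketch, assuming the `args` string always uses the flat `name(value)` layout seen in that row; `parse_hip_args` and `copy_size_bytes` are hypothetical helper names, not part of rocprofiler.

```python
import re

# Matches flat "name(value)" pairs such as sizeBytes(3840) or dst(0xf89e20).
# Assumption: values contain no nested parentheses, as in the row above.
_FIELD = re.compile(r"(\w+)\(([^()]*)\)")


def parse_hip_args(args: str) -> dict:
    """Turn the HIP table's args string into a {name: value} dict of strings."""
    return dict(_FIELD.findall(args))


def copy_size_bytes(args: str):
    """Return sizeBytes as an int, or None if the call has no such field."""
    value = parse_hip_args(args).get("sizeBytes")
    return int(value) if value is not None else None


# The args string from the hipMemcpyAsync row quoted above.
row_args = "( dst(0xf89e20) src(0x7f3480e08000) sizeBytes(3840) kind(4) stream(2))"
fields = parse_hip_args(row_args)
print(fields["sizeBytes"], fields["kind"])
```

With pandas, the same helper could be applied to the column, for example `df["args"].map(copy_size_bytes)`, assuming `df` holds the HIP table.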
Q: where are the kind() values specified?
I assume the timestamps in the HIP table are from the HIP API layer and the times in the COPY and OP tables are from a lower-level API, so I guess I can just join them all on index.
There is a problem with the detailed data for the hipLaunchKernel API; the hipModuleLaunchKernel API seems fine.
hipLaunchKernel,"( kernel(void init_kernel<double>(double*, double*, double*, double, double, double)) function_address(0x202620) numBlocks({}) dimBlocks({}) args(0x7fff93154368) sharedMemBytes(0) stream(1))"
numBlocks and dimBlocks are both empty.
For code that uses the hipModuleLaunchKernel API (i.e., SYCL), the two launch parameters are filled in.
hipModuleLaunchKernel,( kernel(typeinfo name for sycl_kernels::init<double>) f(0x12d7fd0) gridDimX(32768) gridDimY(1) gridDimZ(1) blockDimX(1024) blockDimY(1) blockDimZ(1) sharedMemBytes(0) stream(2) kernelParams(0x8bd628) extra(NULL))
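For mining the launch configuration out of the HIP table, here is a small sketch that pulls the grid and block dimensions from an `args` string like the hipModuleLaunchKernel one above. It assumes the integer fields are written as `name(123)`; `launch_dims` is a hypothetical helper name. For the hipLaunchKernel row it returns `None`, because that row has no populated dimension fields.

```python
import re

# Matches only fields whose value is a plain decimal integer, e.g. gridDimX(32768).
# Hex addresses such as f(0x12d7fd0) and the kernel signature are skipped.
_INT_FIELD = re.compile(r"(\w+)\((\d+)\)")


def launch_dims(args: str):
    """Return {"grid": (x, y, z), "block": (x, y, z)}, or None if any field is missing."""
    fields = {name: int(value) for name, value in _INT_FIELD.findall(args)}
    try:
        grid = tuple(fields["gridDim" + axis] for axis in "XYZ")
        block = tuple(fields["blockDim" + axis] for axis in "XYZ")
    except KeyError:
        return None
    return {"grid": grid, "block": block}


module_args = (
    "( kernel(typeinfo name for sycl_kernels::init<double>) f(0x12d7fd0) "
    "gridDimX(32768) gridDimY(1) gridDimZ(1) blockDimX(1024) blockDimY(1) "
    "blockDimZ(1) sharedMemBytes(0) stream(2) kernelParams(0x8bd628) extra(NULL))"
)
ggl_args = (
    "( kernel(void init_kernel<double>(double*, double*, double*, double, double, double)) "
    "function_address(0x202620) numBlocks({}) dimBlocks({}) args(0x7fff93154368) "
    "sharedMemBytes(0) stream(1))"
)

print(launch_dims(module_args))
print(launch_dims(ggl_args))
```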
You can reproduce the HIP problem with https://github.com/UoB-HPC/BabelStream or with pretty much any HIP code I'm sure. I see code like this:
hipLaunchKernelGGL(HIP_KERNEL_NAME(init_kernel<T>), dim3(array_size/TBSIZE), dim3(TBSIZE), 0, 0, d_a, d_b, d_c, initA, initB, initC);
See https://github.com/UoB-HPC/BabelStream/blob/main/src/hip/HIPStream.cpp#L97-L104
Thanks, and sorry for the long delay in responding. I'm just getting back to the ROCm tools.
Hi, hipMemcpyKind enum is defined here: https://github.com/ROCm-Developer-Tools/HIP/blob/develop/include/hip/driver_types.h#L344
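For decoding the `kind(...)` field in the trace, a lookup like the one below may help. The values are transcribed from my reading of the hipMemcpyKind enum in the header linked above, so treat them as an assumption and check them against the header shipped with the installed ROCm version. `HipMemcpyKind` is a hypothetical Python mirror of that enum, not something HIP provides.

```python
from enum import IntEnum


class HipMemcpyKind(IntEnum):
    """Python mirror of HIP's hipMemcpyKind (assumed values, verify against driver_types.h)."""
    hipMemcpyHostToHost = 0
    hipMemcpyHostToDevice = 1
    hipMemcpyDeviceToHost = 2
    hipMemcpyDeviceToDevice = 3
    hipMemcpyDefault = 4  # runtime infers the direction from the pointers


# kind(4) is the value that appears in the hipMemcpyAsync row earlier in the thread.
print(HipMemcpyKind(4).name)
```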
Hi @lfmeadow, are you still encountering this issue with the latest ROCm 6.2 release?
I really don't know, this was so long ago. And now I work for AMD :) I will close the issue.
The issue is no longer relevant.
--hip-trace gives the COPY calls but the number of bytes transferred is glaringly missing. Is this in the HSA trace layer? There's a mention of being able to trace specific HSA APIs but I can't find a list of those APIs anywhere. Thanks.