ROCm / rocprofiler

ROC profiler library. Profiling with perf-counters and derived metrics.
https://rocm.docs.amd.com/projects/rocprofiler/en/latest/
MIT License

Bytes transferred with CopyHostToDevice and CopyDeviceToHost #69

Closed lfmeadow closed 3 months ago

lfmeadow commented 2 years ago

--hip-trace gives the COPY calls but the number of bytes transferred is glaringly missing. Is this in the HSA trace layer? There's a mention of being able to trace specific HSA APIs but I can't find a list of those APIs anywhere. Thanks.

kikimych commented 2 years ago

Could you please clarify what you mean by number of bytes? copy_stats contains cumulative values. Do you need per-call trace information?

lfmeadow commented 2 years ago

Yes, I want per-call information. Similarly I'd like per-call launch information for all kernel launches. They exist for hipModuleLaunchKernel but are blank for hipLaunchKernelGGL. I'm happy to mine this out of the database, I just want the information.

kikimych commented 2 years ago

https://github.com/ROCm-Developer-Tools/rocprofiler/pull/86

Just a quick workaround: add the --save-temp option and set the temporary directory with the -t option. The raw data with all telemetry will then persist in the temporary directory. I think it will be fixed soon.

kikimych commented 2 years ago

I have checked the db content. All kernel launches and memory copies exist. Could you please provide a small reproducer?

lfmeadow commented 2 years ago

Yes, OK, fair enough, give me a few days.

lfmeadow commented 2 years ago

Well, it took more than a few days. Sorry. I do see that data in the sqlite db for copies:

,BeginNs,EndNs,pid,tid,Name,args,Index,Data,__section,__lane,DurationNs
9164,7148818203319432,7148818203870232,19156,19156,hipMemcpyAsync,( dst(0xf89e20) src(0x7f3480e08000) sizeBytes(3840) kind(4) stream(2)),9165,,2,19156,550800

(Sorry about the long lines, I wish GitHub would wrap.)

This is from pandas after reading the HIP table from the database. I was expecting to see it in the COPY table.
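Since the byte count only shows up inside the args string of the HIP table, one way to get per-call sizes is to parse that string. The following is a minimal sketch, not part of rocprofiler: the helper name parse_hip_args and the regular expression are assumptions made for illustration, and the input is the hipMemcpyAsync row quoted above.

```python
import re

# Matches "name(value)" pairs such as "sizeBytes(3840)".
# Assumption: values contain no nested parentheses.
ARG_PATTERN = re.compile(r"(\w+)\(([^()]*)\)")


def parse_hip_args(args: str) -> dict:
    """Turn a HIP-table args string into a {name: value} dict of strings."""
    return dict(ARG_PATTERN.findall(args))


# args column of the hipMemcpyAsync row shown above
row_args = "( dst(0xf89e20) src(0x7f3480e08000) sizeBytes(3840) kind(4) stream(2))"

parsed = parse_hip_args(row_args)
size_bytes = int(parsed["sizeBytes"])  # bytes requested by this call
kind = int(parsed["kind"])             # hipMemcpyKind value as an integer
print(size_bytes, kind)
```

The same pattern should also pick up flat fields such as gridDimX(...) from a hipModuleLaunchKernel row. It is not suitable for args strings whose values themselves contain parentheses, such as the kernel(...) signature in a hipLaunchKernel row.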

Q: where are the kind() values specified?

I assume the timestamps in the HIP table are from the HIP API layer and the times in the COPY and OP tables are from a lower-level API, so I guess I can just join them all on index.
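A join on index could look roughly like the sketch below. It runs against an in-memory SQLite database filled with toy rows so that it is self-contained. The table names HIP and COPY and the columns Index, Name, args, BeginNs and EndNs come from this thread, but the assumption that the COPY table carries an Index column matching the HIP table is unverified, and the timestamps and the shortened args string are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Toy stand-ins for the profiler tables. Real schemas may differ;
# check them with "PRAGMA table_info(...)" on the actual database.
conn.executescript('''
    CREATE TABLE "HIP"  ("Index" INTEGER, "Name" TEXT, "args" TEXT);
    CREATE TABLE "COPY" ("Index" INTEGER, "BeginNs" INTEGER, "EndNs" INTEGER);
    INSERT INTO "HIP"  VALUES (9165, 'hipMemcpyAsync', '( sizeBytes(3840) kind(4) )');
    INSERT INTO "COPY" VALUES (9165, 100, 350);  -- made-up timestamps
''')

# "Index" is an SQL keyword, so it has to be quoted.
rows = conn.execute('''
    SELECT h."Name", h."args", c."EndNs" - c."BeginNs" AS copy_ns
    FROM "HIP" AS h
    JOIN "COPY" AS c ON h."Index" = c."Index"
''').fetchall()

for name, args, copy_ns in rows:
    print(name, args, copy_ns)
```

If the indices of the two tables turn out not to correspond, the join returns no rows or mismatched rows, so it is worth spot-checking a few calls by hand first.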

There is a problem with the detailed data for the hipLaunchKernel API; the hipModuleLaunchKernel API seems to be fine.

hipLaunchKernel,"( kernel(void init_kernel<double>(double*, double*, double*, double, double, double)) function_address(0x202620) numBlocks({}) dimBlocks({}) args(0x7fff93154368) sharedMemBytes(0) stream(1))"

numBlocks and dimBlocks are both empty.

For code that uses the hipModuleLaunchKernel API (i.e., SYCL) the two launch parameters are filled in.

hipModuleLaunchKernel,( kernel(typeinfo name for sycl_kernels::init<double>) f(0x12d7fd0) gridDimX(32768) gridDimY(1) gridDimZ(1) blockDimX(1024) blockDimY(1) blockDimZ(1) sharedMemBytes(0) stream(2) kernelParams(0x8bd628) extra(NULL))

You can reproduce the HIP problem with https://github.com/UoB-HPC/BabelStream or with pretty much any HIP code I'm sure. I see code like this:

  hipLaunchKernelGGL(HIP_KERNEL_NAME(init_kernel<T>), dim3(array_size/TBSIZE), dim3(TBSIZE), 0, 0, d_a, d_b, d_c, initA, initB, initC);

See https://github.com/UoB-HPC/BabelStream/blob/main/src/hip/HIPStream.cpp#L97-L104

Thanks, and sorry for the long delay in responding. I'm just getting back to the ROCm tools.

kikimych commented 2 years ago

Hi, hipMemcpyKind enum is defined here: https://github.com/ROCm-Developer-Tools/HIP/blob/develop/include/hip/driver_types.h#L344
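For decoding the kind() field in a script, the enum can be mirrored as below. This is a sketch based on the commonly documented hipMemcpyKind values (0 through 4); the Python class and the helper kind_name are hypothetical, and the values should be checked against the header linked above for the ROCm version in use, which may define additional members.

```python
from enum import IntEnum


class HipMemcpyKind(IntEnum):
    """Python mirror of the hipMemcpyKind values (verify against driver_types.h)."""
    hipMemcpyHostToHost = 0
    hipMemcpyHostToDevice = 1
    hipMemcpyDeviceToHost = 2
    hipMemcpyDeviceToDevice = 3
    hipMemcpyDefault = 4  # direction inferred by the runtime from the pointers


def kind_name(value: int) -> str:
    """Return the enum name for a kind() value, or a placeholder if unknown."""
    try:
        return HipMemcpyKind(value).name
    except ValueError:
        return f"unknown({value})"


print(kind_name(4))
```

With this mapping, the kind(4) in the hipMemcpyAsync row above corresponds to hipMemcpyDefault, so the trace entry alone does not say whether that copy was host-to-device or device-to-host.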

harkgill-amd commented 3 months ago

Hi @lfmeadow, are you still encountering this issue with the latest ROCm 6.2 release?

lfmeadow commented 3 months ago

I really don't know, this was so long ago. And now I work for AMD :) I will close the issue.

lfmeadow commented 3 months ago

The issue is no longer relevant.