pentschev opened this issue 5 years ago
cc @kkraus14 would this be helpful? cc @sklam @seibert any thoughts?
On the GPU, I think you want to know:
- What is the GPU doing? A host<->device memcpy? A kernel call?
- If it's a kernel call, what is the symbol name of the kernel?
Exactly, that would be the most important information. Other calls may be useful too, such as cudaStreamSynchronize() and the like.
I believe the libcudf developers would love this for understanding libcudf performance within a Dask-cudf workload.
Is there anyone that we know that would be willing to try it out?
On Fri, Apr 12, 2019 at 4:39 PM Keith Kraus notifications@github.com wrote:
> I believe the libcudf developers would love this for understanding libcudf performance within a Dask-cudf workload.
We're in the process of adding low-level (e.g., C calls) profiling plots in https://github.com/dask/distributed/pull/2575. After that, it may be useful to do the same for CUDA; we could cover both host-side calls (e.g., memory allocation) and kernel calls.
This can be done with the help of CUPTI (the CUDA Profiling Tools Interface). Perhaps the C side could (and maybe should) live in a small library decoupled from this repository, similar to https://github.com/numba/stacktrace.
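To make the idea concrete, here is a minimal sketch of what the Python-side aggregation might look like once raw activity records (kernel launches, memcpys) are collected from something like CUPTI's Activity API. The `GpuActivity` record and `summarize` helper are hypothetical names invented for illustration, not part of CUPTI or Dask; they only show the shape of data a profiling plot could consume.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical record, loosely mirroring the kind of fields a CUPTI
# activity buffer reports for kernels and memcpys (names are made up).
@dataclass
class GpuActivity:
    kind: str       # e.g. "kernel", "memcpy_htod", "memcpy_dtoh"
    name: str       # kernel symbol name, or the memcpy direction
    start_ns: int   # activity start timestamp, nanoseconds
    end_ns: int     # activity end timestamp, nanoseconds

def summarize(records):
    """Aggregate raw activity records into total elapsed time per
    (kind, name) pair -- the sort of summary a Dask worker could ship
    to the scheduler for a profiling plot."""
    totals = defaultdict(int)
    for r in records:
        totals[(r.kind, r.name)] += r.end_ns - r.start_ns
    return dict(totals)

records = [
    GpuActivity("memcpy_htod", "HtoD", 0, 1_000),
    GpuActivity("kernel", "my_kernel", 1_000, 5_000),
    GpuActivity("kernel", "my_kernel", 6_000, 9_000),
]
print(summarize(records))
# Two launches of my_kernel accumulate into a single entry.
```

The actual capture side would subscribe to CUPTI from C and periodically flush activity buffers into records like these; the aggregation shown here is deliberately trivial so it can run on the worker without touching the GPU.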