dask / distributed

A distributed task scheduler for Dask
https://distributed.dask.org
BSD 3-Clause "New" or "Revised" License

Include CUDA stack information in profile plots #2578

Open pentschev opened 5 years ago

pentschev commented 5 years ago

We're in the process of including low-level (e.g., C calls) profiling plots in https://github.com/dask/distributed/pull/2575. After that, it may be useful to do the same with CUDA: we could add both host-side calls (e.g., memory allocation) and kernel calls.

This can be done with the help of CUPTI, the CUDA Profiling Tools Interface. Perhaps the C side could (and maybe should) be part of a small library decoupled from this repository, just like https://github.com/numba/stacktrace.

mrocklin commented 5 years ago

cc @kkraus14 would this be helpful? cc @sklam @seibert any thoughts?

sklam commented 5 years ago

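On the GPU, I think you want to know:

  • What is the GPU doing: an H<->D memcpy or a kernel call?
  • If it is a kernel call, what is the symbol name of the kernel?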

pentschev commented 5 years ago

On the GPU, I think you want to know:

  • What is the GPU doing: an H<->D memcpy or a kernel call?
  • If it is a kernel call, what is the symbol name of the kernel?

Exactly, those would be the most important pieces of information. Other calls may be useful too, such as cudaStreamSynchronize() and the like.
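As a rough illustration of the information discussed above (activity type, kernel symbol name, and runtime calls such as cudaStreamSynchronize()), here is a minimal sketch of how such records might be summarized for a profile plot. The `GpuActivity` record and the `summarize` helper are hypothetical names made up for this example; they are not CUPTI structures or part of distributed, and the sample records are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuActivity:
    """One hypothetical GPU activity record (not a real CUPTI structure)."""

    kind: str      # e.g. "kernel", "memcpy_h2d", "memcpy_d2h", "sync"
    name: str      # kernel symbol name or runtime API call name
    start_ns: int  # start timestamp in nanoseconds
    end_ns: int    # end timestamp in nanoseconds


def summarize(records):
    """Return call count and total time per (kind, name), longest first."""
    totals = defaultdict(lambda: {"count": 0, "total_ns": 0})
    for rec in records:
        entry = totals[(rec.kind, rec.name)]
        entry["count"] += 1
        entry["total_ns"] += rec.end_ns - rec.start_ns
    return sorted(totals.items(), key=lambda kv: kv[1]["total_ns"], reverse=True)


# Invented sample data, only to exercise the helper.
records = [
    GpuActivity("memcpy_h2d", "cudaMemcpy", 0, 400),
    GpuActivity("kernel", "gather_kernel", 400, 1400),
    GpuActivity("kernel", "gather_kernel", 1500, 2300),
    GpuActivity("sync", "cudaStreamSynchronize", 2300, 2350),
]

for (kind, name), stats in summarize(records):
    print(f"{kind:12s} {name:24s} calls={stats['count']} total={stats['total_ns']} ns")
```

Grouping by (kind, name) keeps memcpy, kernel, and synchronization time separate while still reporting each kernel by its symbol name. A real collector would have to fill these records from whatever the profiling backend reports.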

kkraus14 commented 5 years ago

I believe the libcudf developers would love this for understanding libcudf performance within a Dask-cudf workload.

mrocklin commented 5 years ago

Is there anyone we know who would be willing to try it out?
