celeritas-project / celeritas

Celeritas is a new Monte Carlo transport code designed to accelerate scientific discovery in high energy physics by improving detector simulation throughput and energy efficiency using GPUs.
https://celeritas-project.github.io/celeritas/user/index.html

Integrate Perfetto for optional CPU profiling #1224

Closed sethrj closed 1 month ago

sethrj commented 2 months ago

It looks like Perfetto, the Google performance/tracing tool, could be useful to integrate into Celeritas to measure performance. It might be straightforward to wrap their tracing SDK underneath ScopedProfiling.

esseivaju commented 1 month ago

After looking at more of Perfetto's features, I think integrating it would be useful, at least to support the basic slice and counter events.

First, we would keep the ScopedProfiling feature of recording and displaying function calls on a timeline, which could then be used even if CUDA/HIP is unavailable. Perfetto also supports counters: for example, we could have an active-track counter, updated at each step, that is displayed on a timeline (similar to the memory usage below).

image
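To make the two event types concrete, here is a minimal sketch (not Celeritas or Perfetto SDK code) that records slices and counter samples in the Chrome JSON trace event format, which the Perfetto UI can also open. The helper names `scoped_slice` and `counter`, the `pid`/`tid` values, and the halving of `active_tracks` are all hypothetical placeholders for illustration.

```python
import json
import time
from contextlib import contextmanager

# Chrome JSON trace events; timestamps and durations are in microseconds.
events = []


def _now_us():
    """Monotonic clock reading in microseconds."""
    return time.perf_counter_ns() / 1000.0


@contextmanager
def scoped_slice(name):
    """Record a 'complete' slice (ph='X') covering the with-block."""
    start = _now_us()
    try:
        yield
    finally:
        events.append({
            "name": name, "ph": "X", "ts": start,
            "dur": _now_us() - start, "pid": 1, "tid": 1,
        })


def counter(name, value):
    """Record one counter sample (ph='C') at the current time."""
    events.append({
        "name": name, "ph": "C", "ts": _now_us(),
        "pid": 1, "args": {"value": value},
    })


active_tracks = 1000  # hypothetical starting number of tracks
for step in range(3):
    with scoped_slice("step"):
        active_tracks //= 2  # placeholder for real transport work
    counter("active_tracks", active_tracks)

# Serialized trace that could be written to a .json file and loaded in the UI.
trace_json = json.dumps({"traceEvents": events})
```

Each `with` block becomes one bar on the timeline, and each `counter` call adds one point to the counter track, which is the same shape of data the tracing SDK would record from inside ScopedProfiling.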

Then, if we want to support the Linux ftrace API (which requires root privileges), we can see what the kernel is doing: e.g. which processes/threads are scheduled on which CPU cores, memory usage, and syscall entry/exit.

The first picture below illustrates which threads are scheduled on CPU5, the status of these threads, and their syscalls. The second picture shows the memory usage for the celer-sim process only.

image image

In addition to visualization in the web interface, there is also an SQL interface (accessible via either a Python API or the web interface). For example, we can compute the average time per Celeritas action:

image
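As a rough sketch of what such a query looks like, the snippet below runs the aggregation against an in-memory SQLite table that stands in for Perfetto's `slice` table (columns `name`, `ts`, `dur`, with times in nanoseconds). The action names and durations are made-up sample data, not measurements; with a real trace the same SQL would be run through Perfetto's trace processor instead of `sqlite3`.

```python
import sqlite3

# Stand-in for Perfetto's `slice` table; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE slice (name TEXT, ts INTEGER, dur INTEGER)")
conn.executemany(
    "INSERT INTO slice VALUES (?, ?, ?)",
    [
        ("along-step", 0, 400),
        ("pre-step", 400, 100),
        ("along-step", 500, 600),
        ("pre-step", 1100, 300),
    ],
)

# Average duration and call count per slice name, slowest first.
rows = conn.execute(
    "SELECT name, AVG(dur) AS avg_dur, COUNT(*) AS calls "
    "FROM slice GROUP BY name ORDER BY avg_dur DESC"
).fetchall()

for name, avg_dur, calls in rows:
    print(f"{name}: {avg_dur:.1f} ns over {calls} calls")
```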
sethrj commented 1 month ago

That's awesome, @esseivaju! Let's do it. I'm curious how this Google-developed tool compares to the in-house CERN performance tools.

esseivaju commented 1 month ago

Are you referring to AdaptivePerf?