eth-easl / orion

An interference-aware scheduler for fine-grained GPU sharing
MIT License

Question about measuring Compute throughput utilization #34

Closed: atomicapple0 closed this issue 4 months ago

atomicapple0 commented 4 months ago

Hi,

How were the plots of GPU Compute Throughput Utilization and GPU Memory Bandwidth Utilization in Figures 1, 8, and 9 of the paper generated? Were these metrics recorded with Nsight Compute, CUPTI, NVML, or some other API? Could the script be shared, if possible?

Best, Brian

fotstrt commented 4 months ago

Hi Brian,

We used a combination of the Nsight Systems and Nsight Compute tools. More specifically, we took the execution trace from Nsight Systems (i.e., the start and end time of each kernel) and, for each kernel, obtained its compute and memory utilization from Nsight Compute.

All the scripts we used are available at https://github.com/eth-easl/orion/tree/main/profiling

Instructions on how to use the scripts are in https://github.com/eth-easl/orion/blob/main/PROFILE.md. To generate the plots, follow all the steps up to this section: https://github.com/eth-easl/orion/blob/main/PROFILE.md#optional-plot-traces
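
To illustrate the idea before opening the scripts, here is a minimal Python sketch of how a kernel timeline and per-kernel utilization numbers could be combined into a utilization-over-time series. It is not the code from the Orion repository. The names `KernelInterval` and `utilization_timeline`, the input layout, and the binning scheme are all assumptions made for illustration. It also assumes that kernels do not overlap in time and that utilization can be looked up by kernel name, which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class KernelInterval:
    """One kernel launch from a timeline trace (hypothetical layout)."""
    name: str
    start_ns: int
    end_ns: int


def utilization_timeline(intervals, per_kernel_util, bin_ns):
    """Time-weighted compute/memory utilization per fixed-size time bin.

    intervals:       list of KernelInterval (assumed non-overlapping)
    per_kernel_util: dict mapping kernel name -> (compute_pct, memory_pct)
    bin_ns:          bin width in nanoseconds
    Returns a list of (bin_start_ns, compute_pct, memory_pct).
    Time in which no kernel runs counts as 0% utilization.
    """
    if not intervals:
        return []
    t0 = min(k.start_ns for k in intervals)
    t1 = max(k.end_ns for k in intervals)
    n_bins = max(1, -(-(t1 - t0) // bin_ns))  # ceiling division
    compute = [0.0] * n_bins
    memory = [0.0] * n_bins

    for k in intervals:
        c_pct, m_pct = per_kernel_util[k.name]
        first = (k.start_ns - t0) // bin_ns
        last = min(n_bins - 1, (k.end_ns - t0 - 1) // bin_ns)
        for b in range(first, last + 1):
            b_start = t0 + b * bin_ns
            b_end = b_start + bin_ns
            # Portion of this kernel's run time that falls inside bin b.
            overlap = min(k.end_ns, b_end) - max(k.start_ns, b_start)
            if overlap > 0:
                compute[b] += c_pct * overlap / bin_ns
                memory[b] += m_pct * overlap / bin_ns

    return [(t0 + b * bin_ns, compute[b], memory[b]) for b in range(n_bins)]


if __name__ == "__main__":
    # Made-up example values, only to show the call shape.
    trace = [
        KernelInterval("kernel_a", 0, 100),
        KernelInterval("kernel_b", 150, 200),
    ]
    util = {"kernel_a": (80.0, 20.0), "kernel_b": (10.0, 60.0)}
    for row in utilization_timeline(trace, util, bin_ns=100):
        print(row)
```

The returned rows can be passed to any plotting library. If kernels from several streams overlap, the per-bin sums can exceed 100%, so that case would need separate handling.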

I hope that helps!

atomicapple0 commented 4 months ago

Thank you! This is very helpful. :D