c3sr / comm_scope

NUMA-aware multi-CPU multi-GPU data transfer benchmarks
https://github.com/c3sr/scope
Apache License 2.0

Break benchmarks out into latency and bandwidth #38

Open cwpearson opened 4 years ago

This would be similar to the approach in "Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect". The bandwidth benchmarks already use cudaEvents to time transfers and compute bandwidth, but we could add an explicit latency measurement, where the transfer size is minimal, alongside a bandwidth measurement, where the transfer size is larger.

We could break these out into two separate benchmarks so the reporting is easier to understand.
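
A rough sketch of what the split reporting might look like. This is not comm_scope code: it is a host-only Python stand-in that times a `bytearray` copy with `time.perf_counter` instead of timing a CUDA transfer with cudaEvents. The function names and the two transfer sizes (1 byte and 64 MiB) are assumptions chosen for illustration.

```python
import statistics
import time


def time_copy(nbytes, iters=50):
    """Return the median wall-clock seconds for one copy of nbytes bytes."""
    src = bytearray(nbytes)
    dst = bytearray(nbytes)
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        dst[:] = src  # stand-in for the transfer being measured
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def latency_us(seconds):
    """Report a minimal-size transfer as latency in microseconds."""
    return seconds * 1e6


def bandwidth_gb_per_s(nbytes, seconds):
    """Report a large transfer as bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return nbytes / seconds / 1e9


if __name__ == "__main__":
    small = 1         # hypothetical "minimal" transfer size, in bytes
    large = 64 << 20  # hypothetical "larger" transfer size: 64 MiB
    print(f"latency:   {latency_us(time_copy(small)):.3f} us")
    print(f"bandwidth: {bandwidth_gb_per_s(large, time_copy(large, iters=10)):.2f} GB/s")
```

The same timing loop feeds both numbers; only the transfer size and the reported unit differ, which is why two separately named benchmarks could make the output easier to read.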