NVIDIA / nvbench

CUDA Kernel Benchmarking Library
Apache License 2.0

[FEA] Add `nvbench::exec_tag::host` to support CPU-only benchmarks #95

Open sleeepyjack opened 2 years ago

sleeepyjack commented 2 years ago

NVBench does not currently support benchmarking CPU-only code natively. Using `nvbench::exec_tag::sync` gives plausible measurements for cold runs, but there is no mechanism for batch measurements. We could enable this feature by adding a distinct exec tag, e.g. `nvbench::exec_tag::host`.
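To make the cold vs. batch distinction concrete, here is a minimal sketch of both measurement styles for CPU-only code, timed with a host clock. It is not NVBench API and not a proposed design: the helper names `measure_cold` and `measure_batch`, and the choice of `std::chrono::steady_clock`, are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helpers, NOT part of NVBench. They only illustrate
// host-clock timing of CPU-only code.
using host_clock = std::chrono::steady_clock;

// Cold measurement: every invocation is timed on its own.
// Returns one duration (in seconds) per sample.
template <typename F>
std::vector<double> measure_cold(F &&f, std::size_t samples)
{
  std::vector<double> times;
  times.reserve(samples);
  for (std::size_t i = 0; i < samples; ++i)
  {
    const auto start = host_clock::now();
    f();
    const auto stop = host_clock::now();
    times.push_back(std::chrono::duration<double>(stop - start).count());
  }
  return times;
}

// Batch measurement: batch_size back-to-back invocations are timed with a
// single pair of clock reads. Returns the mean seconds per invocation.
template <typename F>
double measure_batch(F &&f, std::size_t batch_size)
{
  assert(batch_size > 0);
  const auto start = host_clock::now();
  for (std::size_t i = 0; i < batch_size; ++i)
  {
    f();
  }
  const auto stop = host_clock::now();
  const double total = std::chrono::duration<double>(stop - start).count();
  return total / static_cast<double>(batch_size);
}
```

In this sketch, `measure_cold(work, 100)` would yield 100 individual timings, while `measure_batch(work, 1000)` amortizes the clock overhead over 1000 calls, which matters when a single call is very short.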

jrhemstad commented 1 year ago

Note that using exec_tag::sync isn't really reliable for CPU-only benchmarks because it still uses CUDA events for timing. This works, but it's a little hacky.

The main things an `exec_tag::host` would need to do:

cliffburdick commented 5 months ago

Can this one get a bump given Grace is a common use case now?