This PR is the initial step towards more power awareness in QUDA; it also adds OpenMP threading for host kernels
Adds power, temperature and clock monitoring
Monitoring is enabled with QUDA_ENABLE_MONITOR=1 (default is off)
Monitoring is performed on a spawned thread, maintaining the history in a linked list
Default monitoring period is QUDA_ENABLE_MONITOR_PERIOD=1
Monitor info, together with the derived energy usage, is dumped to a monitor_*****.tsv file, where the ***** encodes the rank id and the date_time of the dump. All ranks have identical times by construction.
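The monitoring scheme described above (a spawned thread that samples periodically, keeps its history in a linked list, and derives energy from the power samples) can be sketched roughly as follows. This is a minimal illustration, not QUDA's implementation: `Sample`, `Monitor`, `energy_joules`, `to_tsv`, the TSV column names and the injected `read_power` callback are all hypothetical, and the trapezoidal integration is an assumption about how energy might be derived.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <iterator>
#include <list>
#include <sstream>
#include <string>
#include <thread>
#include <utility>

// One monitoring sample (hypothetical layout, not QUDA's).
struct Sample {
  double time_s;  // seconds since monitoring started
  double power_w; // instantaneous power draw in watts
};

// Integrate power over time with the trapezoidal rule to obtain energy in joules.
inline double energy_joules(const std::list<Sample> &history)
{
  double energy = 0.0;
  auto prev = history.begin();
  if (prev == history.end()) return energy;
  for (auto it = std::next(prev); it != history.end(); ++it, ++prev)
    energy += 0.5 * (it->power_w + prev->power_w) * (it->time_s - prev->time_s);
  return energy;
}

// Render the history as tab-separated text with a running energy column.
inline std::string to_tsv(const std::list<Sample> &history)
{
  std::ostringstream out;
  out << "time_s\tpower_w\tenergy_j\n";
  double energy = 0.0;
  const Sample *prev = nullptr;
  for (const Sample &s : history) {
    if (prev) energy += 0.5 * (s.power_w + prev->power_w) * (s.time_s - prev->time_s);
    out << s.time_s << '\t' << s.power_w << '\t' << energy << '\n';
    prev = &s;
  }
  return out.str();
}

// Samples a user-supplied power reader on a background thread.
class Monitor
{
public:
  Monitor(std::function<double()> read_power, std::chrono::milliseconds period) :
    read_power_(std::move(read_power)), period_(period)
  {
  }
  ~Monitor() { stop(); }

  void start()
  {
    running_ = true;
    worker_ = std::thread([this] {
      const auto t0 = std::chrono::steady_clock::now();
      while (running_) {
        const double t = std::chrono::duration<double>(std::chrono::steady_clock::now() - t0).count();
        history_.push_back({t, read_power_()}); // append to the linked list
        std::this_thread::sleep_for(period_);
      }
    });
  }

  void stop()
  {
    running_ = false;
    if (worker_.joinable()) worker_.join();
  }

  // Only safe to read after stop(): the join orders the worker's writes before the read.
  const std::list<Sample> &history() const { return history_; }

private:
  std::function<double()> read_power_;
  std::chrono::milliseconds period_;
  std::atomic<bool> running_ {false};
  std::thread worker_;
  std::list<Sample> history_;
};
```

In this sketch the power reader is passed in as a callback so the example stays independent of any particular device query API; a real monitor would presumably read power, temperature and clocks from the GPU driver instead.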
Add OpenMP threading support for all CPU kernels
Most kernels seem to get reasonable scaling
QUDA_OPENMP CMake parameter is no longer marked as advanced
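As a rough illustration of what OpenMP threading of a host kernel looks like, the sketch below threads a simple element-wise update and a reduction. The functions `axpy` and `norm2` are hypothetical stand-ins written for this example and are not taken from QUDA's CPU kernels.

```cpp
#include <vector>

// y[i] += a * x[i]. Each iteration is independent, so the loop can be
// split across threads without synchronization. Assumes x and y have equal size.
void axpy(double a, const std::vector<double> &x, std::vector<double> &y)
{
  const long n = static_cast<long>(x.size());
#pragma omp parallel for
  for (long i = 0; i < n; i++) y[i] += a * x[i];
}

// Sum of squares. The reduction clause gives each thread a private partial
// sum and combines the partial sums when the loop finishes.
double norm2(const std::vector<double> &x)
{
  double sum = 0.0;
  const long n = static_cast<long>(x.size());
#pragma omp parallel for reduction(+ : sum)
  for (long i = 0; i < n; i++) sum += x[i] * x[i];
  return sum;
}
```

When the code is compiled without OpenMP support the pragmas are ignored and the loops run serially, which is why this kind of threading can sit behind a build option such as QUDA_OPENMP.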
Fixes compiler warning introduced in #1416
Fixes bug introduced in #1416
Fixes an issue in endQuda when memory leaks are detected in multi-GPU runs: printfQuda would fail because comm_rank() was called after the comms had been torn down
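The ordering problem behind this bug can be shown with a toy model. Everything here is hypothetical (`Comms`, `print_rank0`, `shutdown_buggy`, `shutdown_reordered`), and the PR text does not say how the fix was actually made; reporting leaks before tearing down the comms is only one possible ordering that avoids the failure.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in for a communications layer (not QUDA's API).
struct Comms {
  bool initialized = true;
  int rank_id = 0;

  // Querying the rank is only valid while the comms are alive.
  int rank() const
  {
    if (!initialized) throw std::logic_error("rank queried after comms teardown");
    return rank_id;
  }
  void finalize() { initialized = false; }
};

// Rank-0-only print: it must query the rank before deciding to print.
void print_rank0(const Comms &comms, std::vector<std::string> &log, const std::string &msg)
{
  if (comms.rank() == 0) log.push_back(msg);
}

// Problematic order: the comms are torn down first, so the leak report fails.
void shutdown_buggy(Comms &comms, std::vector<std::string> &log, bool leaks_detected)
{
  comms.finalize();
  if (leaks_detected) print_rank0(comms, log, "memory leaks detected");
}

// One possible safe order: report while the comms are still alive, then tear down.
void shutdown_reordered(Comms &comms, std::vector<std::string> &log, bool leaks_detected)
{
  if (leaks_detected) print_rank0(comms, log, "memory leaks detected");
  comms.finalize();
}
```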