Closed: halfflat closed 3 years ago
ARB_NUM_THREADS. Is that a bug?
std::thread::hardware_concurrency() by default?
I would like to remove support for OMP_NUM_THREADS, having pushed for it multiple times in the past and been shot down.
Using environment variables to set GPU and thread count should be opt-in, like it is in the proposal, and accessed via something like arbenv::default_gpu(), arbenv::default_concurrency, etc.
We originally set the default thread count according to the environment/OS, and then removed that behaviour.
What was the reason for moving back to a default thread count of 1?
std::thread::hardware_concurrency() is something we can trust, outside of MPI contexts.
Strongly held opinion: by default I expect context() to use the number of available threads (on a local machine). Maybe context(all_cores=True)?
To answer the question regarding std::thread::hardware_concurrency(), it's not as trustworthy as it looks, sadly.
A default value of one for the thread count is safe. It makes it much easier to compare performance across different systems, as automatic determination of available threads is always a gamble. And while in some contexts a user might want to use all available hardware threads, in others they may want to use only a subset (e.g. multiple MPI ranks per node, or to avoid SMT), or even to oversubscribe. A fixed default also avoids hard-to-debug and hard-to-reproduce system-specific issues with thread count determination.
When a user wishes to use more than one thread, it should be an active choice.
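To make the "active choice" concrete, here is a minimal C++ sketch of what opting in could look like. Everything in the `sketch` namespace (`resources`, `system_thread_hint`, `threads_per_rank`) is a hypothetical name made up for illustration and is not Arbor's API. The only real library call is std::thread::hardware_concurrency(), which the standard allows to return 0 when the value is not computable, hence the fallback.

```cpp
#include <cassert>
#include <thread>

namespace sketch {

// Hypothetical resource description: one thread unless the caller says otherwise.
struct resources {
    unsigned num_threads = 1;
};

// Explicit opt-in: ask the system for a thread count hint.
// hardware_concurrency() is only a hint and may return 0, so fall back to 1.
inline unsigned system_thread_hint() {
    unsigned n = std::thread::hardware_concurrency();
    return n > 0 ? n : 1;
}

// Another explicit choice: share a node's threads between several MPI ranks.
// Always returns at least one thread per rank.
inline unsigned threads_per_rank(unsigned hint, unsigned ranks_per_node) {
    assert(ranks_per_node > 0);
    unsigned share = hint / ranks_per_node;
    return share > 0 ? share : 1;
}

} // namespace sketch
```

With this shape, a default-constructed `resources` always means one thread, and a user who wants more writes something like `r.num_threads = sketch::threads_per_rank(sketch::system_thread_hint(), 2);`, so the decision is visible in their code.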
Goal: allow simple environment-based determination of resources for an execution context from arborenv.
The motivations are: reduce boilerplate for querying environment variables and the threading environment; provide consistent environment variables for user code that takes advantage of this functionality; ease unit testing in multithreaded and GPU contexts.
Proposal:
Remove OMP_NUM_THREADS from the environment check.
ARB_NUM_THREADS renamed to ARBENV_NUM_THREADS to make it clear that it is functionality from the arbenv library.
arbenv::default_concurrency that wraps the environment-check-or-else-thread-count-from-system code.
ARBENV_GPU_ID; if set, and we have GPU support, and it’s within the device count, we use that value in arbenv::default_gpu(). A value <0 would effectively mean: do not use the GPU.
In the future we can add a facility for marking unit tests that are specific to GPU or multithreaded functionality, so we can filter for them at invocation time.
Related: #982, #983