Open marioroy opened 8 months ago
@llvm/issue-subscribers-openmp
Author: Mario Roy (marioroy)
Did you explore OMP_WAIT_POLICY?
Just now. Thank you for the suggestion. I'm unable to see any difference in power consumption for clang or nvc.
OMP_WAIT_POLICY=passive ./primes1.clang 1e10 -p >/dev/null
OMP_WAIT_POLICY=passive ./primes1.nvc 1e10 -p >/dev/null
Still seeing near 6400% CPU utilization versus less than 200% running primes1.gcc.
Interestingly, primes1.gcc (GNU gcc) supports OMP_WAIT_POLICY, and I can see both active and passive (default) working.
OMP_WAIT_POLICY=active ./primes1.gcc 1e10 -p >/dev/null    # 6400% CPU utilization
OMP_WAIT_POLICY=passive ./primes1.gcc 1e10 -p >/dev/null   # <200%
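For anyone who wants to reproduce the percentages without watching top: in its default mode, top sums a process's CPU use across logical CPUs, so 64 busy threads show up as roughly 6400%. The same figure can be derived from timings as (user + sys) / real × 100. A minimal sketch, where cpu_percent is a hypothetical helper and the timings are made up for illustration, not measurements from this thread:

```shell
# Hypothetical helper: CPU utilization in percent from user, system, and
# wall-clock seconds, computed as (user + sys) / real * 100.
cpu_percent() {
    awk -v u="$1" -v s="$2" -v r="$3" 'BEGIN { printf "%.0f\n", (u + s) / r * 100 }'
}

# Made-up timings: 630 s user + 10 s system over 10 s of wall-clock time,
# which is what 64 fully busy logical CPUs would look like.
cpu_percent 630 10 10    # prints 6400
```

The three inputs can be read off the output of `time ./primes1.clang 1e10 -p >/dev/null`.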
This is by design. The OpenMP threads spin-wait because they are much more responsive and wake up quicker when needed again. OMP_WAIT_POLICY=passive means that the threads go into a deep-sleep mode where they need to be woken up via an OS signal, which has much more latency.
There's a default timeout after which spin-waiting threads go into that deep-sleep state. The default is about 200ms, but you should be able to change that via KMP_BLOCKTIME=50, which would set the spin-wait timeout to about 50ms.
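A quick way to try several timeouts is to rerun the same command under different KMP_BLOCKTIME values and watch top in another terminal. This is only a sketch: run_with_blocktime is a hypothetical helper, and the value list is an assumption (50 and 200 come from the comment above; 0 is, as far as I know, accepted by the LLVM runtime and asks threads to sleep without spinning). Whether any of these changes the 6400% figure for this test is not confirmed in this thread.

```shell
# Hypothetical helper: run the given command once per KMP_BLOCKTIME value
# (milliseconds). The label goes to stderr so it stays visible when the
# command's stdout is redirected to /dev/null.
run_with_blocktime() {
    for ms in 0 50 200; do
        echo "KMP_BLOCKTIME=$ms" >&2
        env KMP_BLOCKTIME="$ms" "$@"
    done
}
```

Usage, with the binary from this issue: `run_with_blocktime ./primes1.clang 1e10 -p >/dev/null`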
Thank you. Unfortunately, using clang/clang++, I'm unable to see a difference when setting OMP_WAIT_POLICY=passive. Top reports near 6400% CPU utilization, which equates to high power consumption. I'm grateful for the ability to set passive, but it does not seem to work with clang.
Hm, OK. Please try the explicit forms:
OMP_WAIT_POLICY Decides whether threads spin (active) or yield (passive) while they are waiting. OMP_WAIT_POLICY=active is an alias for KMP_LIBRARY=turnaround, and OMP_WAIT_POLICY=passive is an alias for KMP_LIBRARY=throughput.
Does that change things?
No change.
KMP_LIBRARY=turnaround ./primes1.clang 1e10 -p >/dev/null   # 6400% CPU utilization
OMP: Warning #182: OMP_WAIT_POLICY: ignored because KMP_LIBRARY has been defined
KMP_LIBRARY=throughput ./primes1.clang 1e10 -p >/dev/null   # 6400%
OMP: Warning #182: OMP_WAIT_POLICY: ignored because KMP_LIBRARY has been defined
@jpeyton52 Could you have a look at this at some point and see if there's a bug?
[OpenMP] OMP_WAIT_POLICY=PASSIVE still keeps the threads without work running https://github.com/llvm/llvm-project/issues/63732
I played around with GNU GCC. The OMP_WAIT_POLICY implementation behaves as described.
OMP_WAIT_POLICY – How waiting threads are handled in GNU GCC
"Description:
Specifies whether waiting threads should be active or passive. If the value is PASSIVE, waiting threads should not consume CPU power while waiting; while the value is ACTIVE specifies that they should. If undefined, threads wait actively for a short time before waiting passively."
./primes1.gcc 1e10 -p >/dev/null                           # 172% CPU utilization
OMP_WAIT_POLICY=passive ./primes1.gcc 1e10 -p >/dev/null   # 133%
OMP_WAIT_POLICY=active ./primes1.gcc 1e10 -p >/dev/null    # 6400%
Re-posting from https://forums.developer.nvidia.com/t/openmp-excessive-power-consumption-for-waiting-threads/279272
"The OpenMP power consumption test is with the -p argument to primes1 or primes3, which involves ordered output, or one thread writing output at a time. Other threads wait their turn, in order. I expect the waiting threads to be idle or to consume little CPU. That is not the case: I am seeing full 6400% CPU utilization (AMD Threadripper 3970X, 64 logical CPU threads) for printing prime numbers to /dev/null. Nothing like GNU GCC consuming just 173% for the same test."
I also see near 6400% CPU utilization using clang for the power consumption test, during ordered output.
Prime Demos
OpenMP Ordered Power Consumption Test
OpenMP Performance Test
I first witnessed the power consumption issue using Codon.
https://github.com/exaloop/codon/issues/456
Is it okay for waiting threads to spin the CPU during ordered or exclusive blocks? I wonder about cloud customers possibly paying for extra power consumption simply because threads are waiting their turn. The Intel oneAPI compilers are also impacted.