Closed ronawho closed 5 years ago
I have a prototype branch that enables dynamic pinning/unpinning at runtime. It's available at https://github.com/ronawho/chapel/tree/qthread-dynamic-pinning
It can be used for NPB with:
diff --git a/src/DistributedFFT.chpl b/src/DistributedFFT.chpl
index 1d9844a..9824433 100644
--- a/src/DistributedFFT.chpl
+++ b/src/DistributedFFT.chpl
@@ -149,8 +149,10 @@ prototype module DistributedFFT {
if !warmUpOnly {
if !plan_yz.isValid then
halt("Error! Plan generation failed! Did you run the warmup routine?");
- // Execute the plan
+ // Execute the plan (disabling affinity to avoid interference)
+ chpl_disableAffinity();
plan_yz.execute();
+ chpl_enableAffinity();
}
// on loc ends
Here's some performance measurements with it for NPB-FT size D on Swan:
Locales | always pin | never pin | unpin for fftw |
---|---|---|---|
1 | 600s | 200s | 158s |
8 | 165s | 80s | 66s |
Always pinning kills the multi-threaded fftw computation, and never pinning
hurts the Chapel code. The unpin just for fftw is a nice middle ground, but I'm
still curious to see if we can implement doFFT_YZ()
with Chapel tasks.
The SerialFFTW
branch now just uses the serial FFT code with Chapel handling all parallelism, and gets similar performance to the pthread backend.
I think we can close this now that SerialFFTW is merged with master. I've updated https://github.com/chapel-lang/chapel/issues/9882 with a pointer to my branch. I think we still want to address that longer term, but I don't think it's required for this anymore.
Currently, we're running this benchmark with
QT_AFFINITY=no
QT_SPINCOUNT=300
in order to prevent Chapel's threads and FFTW's threads from trampling each other. This is https://github.com/chapel-lang/chapel/issues/9882 upstream.Disabling affinity improves the multi-threaded FFTW, but hurts the performance of the native Chapel code. I think we either need a way to turn affinity on and off at runtime, or we need to avoid calling out to multi-threaded FFTW.
To Do--
doFFT_YZ
with single threaded FFTW and Chapel handling the parallelism.doFFT_YZ
with manual pinning and unpinning of Chapel threads