Closed by dji-transpire 3 years ago
Some cleanup may be needed; it took me a while to even half understand which defaults are used for the different options of mirtCluster, such as spec and omp_threads.
Thanks for this, I'll look it over when I get a chance on my Linux box. One question though: I see instances of technical$omp <- TRUE where I initially had these set to FALSE. My thinking was that mixing OpenMP into R's parallel forking schemes is not a good idea, because OpenMP would be asked to use more processors than are actually available. Is this behaviour safe, or does OpenMP simply ignore processors that are not available at runtime?
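To make the oversubscription concern concrete, here is a rough sketch of the arithmetic. As far as I know, an OpenMP runtime does not skip threads because cores are busy; unless dynamic adjustment or a thread limit is configured, it normally starts the number requested and leaves the OS to time-share them. The helper names below are hypothetical and not part of mirt, and the core count is passed in as a plain parameter rather than detected.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helpers for illustration only; none of these exist in mirt.

// Total threads alive if each of `workers` R processes starts
// `omp_threads` OpenMP threads of its own.
unsigned total_threads(unsigned workers, unsigned omp_threads) {
    return workers * omp_threads;
}

// True when the combined thread count exceeds the available cores,
// i.e. the OS has to time-share cores between threads.
bool oversubscribed(unsigned workers, unsigned omp_threads, unsigned cores) {
    return total_threads(workers, omp_threads) > cores;
}

// Largest per-worker OpenMP thread count that keeps
// workers * omp_threads <= cores, but never less than 1.
unsigned capped_omp_threads(unsigned workers, unsigned cores) {
    assert(workers > 0 && cores > 0);
    return std::max(1u, cores / workers);
}
```

For example, assuming a machine with 12 physical cores, 6 workers with 6 OpenMP threads each would ask for 36 threads, whereas capping at capped_omp_threads(6, 12) gives 2 threads per worker and stays within the core count.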
Here are the results of a simple speed test with this branch.
Machine: 2 × Intel Xeon X5650 + Ubuntu 19.04 + 96 GiB DDR3 ECC RAM + NVBLAS (NVIDIA GeForce GTX 1050) + OpenBLAS
# nvidia-smi
Tue Jun 23 00:51:02 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82 Driver Version: 440.82 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1050 Off | 00000000:0F:00.0 Off | N/A |
| 0% 36C P8 N/A / 70W | 769MiB / 1999MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1530 G /usr/lib/xorg/Xorg 120MiB |
| 0 2033 G xfwm4 2MiB |
| 0 6338 C /usr/lib/rstudio-server/bin/rsession_ 79MiB |
| 0 9785 C /usr/lib/R/bin/exec/R 79MiB |
| 0 10101 C /usr/lib/R/bin/exec/R 79MiB |
| 0 10361 C /usr/lib/R/bin/exec/R 79MiB |
| 0 11052 C /usr/lib/R/bin/exec/R 79MiB |
| 0 20240 C /usr/lib/rstudio-server/bin/rsession_ 79MiB |
| 0 25601 C /usr/lib/R/bin/exec/R 79MiB |
| 0 25698 C /usr/lib/R/bin/exec/R 79MiB |
+-----------------------------------------------------------------------------+
Multiple group model with MHRM (CRAN version, without any mirtCluster()):
> modMG@time
TOTAL: Data MH_draws Mstep SE SE Post
69.690 0.059 9.208 23.253 32.364 0.000 3.313
Multiple group model with MHRM (this branch, without any mirtCluster()):
> modMG@time
TOTAL: Data MH_draws Mstep SE SE Post
96.017 0.061 8.068 24.411 58.372 0.000 3.443
Multiple group model with MHRM (2 threads + 2 OMP threads):
> modMG@time
TOTAL: Data MH_draws Mstep SE SE Post
104.074 0.058 7.620 25.023 67.572 0.000 2.072
Multiple group model with MHRM (2 threads + 4 OMP threads):
> modMG@time
TOTAL: Data MH_draws Mstep SE SE Post
98.649 0.057 7.203 24.132 63.167 0.000 1.954
Multiple group model with MHRM (2 threads + 6 OMP threads):
> modMG@time
TOTAL: Data MH_draws Mstep SE SE Post
99.057 0.059 7.655 23.611 64.136 0.000 1.942
Multiple group model with MHRM (6 threads + 6 OMP threads):
> modMG@time
TOTAL: Data MH_draws Mstep SE SE Post
93.462 0.064 7.979 23.773 59.088 0.000 0.949
Here's a more recent Estep.cpp file