philchalmers / mirt

Multidimensional item response theory
https://philchalmers.github.io/mirt/

Added some more complete OPENMP code #183

dji-transpire closed 3 years ago

dji-transpire commented 4 years ago

Here's a more recent Estep.cpp file

dji-transpire commented 4 years ago

Some cleanup may be needed. It took me a while to even half understand which defaults are used for the different options of mirtCluster, such as spec and omp_threads.

philchalmers commented 4 years ago

Thanks for this, I'll give it a look over when I get a chance on my Linux box. Question though: I see instances of technical$omp <- TRUE where I initially had FALSE. My thinking was that mixing OpenMP into R's parallel forking schemes is not a good idea, because OpenMP could end up requesting more processors than are available. Is this behaviour safe, or does OpenMP simply ignore processors that are not available at runtime?
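A minimal sketch of the concern above, not taken from the patch: if every R worker process opens its own OpenMP team, the total thread count is roughly the number of workers times the OpenMP threads per worker. OpenMP runtimes typically create the threads that are requested (unless dynamic adjustment or a thread limit is in effect) and leave the operating system to time-slice them, so one conservative option is to cap the per-worker count. The helper names `capped_omp_threads` and `detected_cores` are hypothetical and are not part of mirt or of the OpenMP API.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper (not part of mirt): cap the OpenMP thread count used by
// each worker process so that n_workers * threads_per_worker <= n_cores.
// Always returns at least 1, even for nonsensical inputs.
int capped_omp_threads(int requested, int n_workers, int n_cores) {
    if (n_workers < 1) n_workers = 1;
    if (n_cores < 1) n_cores = 1;
    // Cores each worker can use without oversubscribing the machine.
    int budget = std::max(1, n_cores / n_workers);
    return std::max(1, std::min(requested, budget));
}

// Hypothetical helper: number of hardware threads reported by the standard
// library. hardware_concurrency() may return 0 when the value is unknown,
// in which case this falls back to 1.
int detected_cores() {
    unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : static_cast<int>(n);
}
```

With these assumptions, 6 workers that each ask for 6 OpenMP threads on a 12-core machine would be capped at `capped_omp_threads(6, 6, 12)`, that is 2 threads per worker, and the result could then be passed to `omp_set_num_threads()` inside each worker.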

seonghobae commented 4 years ago

Here are the results of a simple speed test with this branch.

Machine: 2 × Intel Xeon X5650 + Ubuntu 19.04 + 96 GiB DDR3 ECC RAM + NVBLAS (NVIDIA GeForce GTX 1050) + OpenBLAS

# nvidia-smi
Tue Jun 23 00:51:02 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    Off  | 00000000:0F:00.0 Off |                  N/A |
|  0%   36C    P8    N/A /  70W |    769MiB /  1999MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1530      G   /usr/lib/xorg/Xorg                           120MiB |
|    0      2033      G   xfwm4                                          2MiB |
|    0      6338      C   /usr/lib/rstudio-server/bin/rsession_         79MiB |
|    0      9785      C   /usr/lib/R/bin/exec/R                         79MiB |
|    0     10101      C   /usr/lib/R/bin/exec/R                         79MiB |
|    0     10361      C   /usr/lib/R/bin/exec/R                         79MiB |
|    0     11052      C   /usr/lib/R/bin/exec/R                         79MiB |
|    0     20240      C   /usr/lib/rstudio-server/bin/rsession_         79MiB |
|    0     25601      C   /usr/lib/R/bin/exec/R                         79MiB |
|    0     25698      C   /usr/lib/R/bin/exec/R                         79MiB |
+-----------------------------------------------------------------------------+

Multiple group model with MHRM (CRAN version, without any mirtCluster()):

> modMG@time
  TOTAL:     Data MH_draws    Mstep       SE       SE     Post 
  69.690    0.059    9.208   23.253   32.364    0.000    3.313 

Multiple group model with MHRM (this branch, without any mirtCluster()):

> modMG@time
  TOTAL:     Data MH_draws    Mstep       SE       SE     Post 
  96.017    0.061    8.068   24.411   58.372    0.000    3.443

Multiple group model with MHRM (2 threads + 2 OMP threads):

> modMG@time
  TOTAL:     Data MH_draws    Mstep       SE       SE     Post 
 104.074    0.058    7.620   25.023   67.572    0.000    2.072

Multiple group model with MHRM (2 threads + 4 OMP threads):

> modMG@time
  TOTAL:     Data MH_draws    Mstep       SE       SE     Post 
  98.649    0.057    7.203   24.132   63.167    0.000    1.954

Multiple group model with MHRM (2 threads + 6 OMP threads):

> modMG@time
  TOTAL:     Data MH_draws    Mstep       SE       SE     Post 
  99.057    0.059    7.655   23.611   64.136    0.000    1.942

Multiple group model with MHRM (6 threads + 6 OMP threads):

> modMG@time
  TOTAL:     Data MH_draws    Mstep       SE       SE     Post 
  93.462    0.064    7.979   23.773   59.088    0.000    0.949
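
To make the six runs easier to compare, the sketch below expresses each timing as a ratio to the CRAN baseline (values above 1 are slower than CRAN). The numbers are copied from the modMG@time output above, using the first of the two SE columns. The struct and function names are hypothetical, and since each configuration was timed once, the ratios are indicative only.

```cpp
#include <cstdio>

// One row of modMG@time output, in the units printed by mirt.
struct Run {
    const char* label;
    double total;  // TOTAL column
    double se;     // first SE column (the second SE column is 0.000 in every run)
    double post;   // Post column
};

// Timings copied from the runs reported in this comment.
const Run kRuns[] = {
    {"CRAN, no mirtCluster()",        69.690, 32.364, 3.313},
    {"this branch, no mirtCluster()", 96.017, 58.372, 3.443},
    {"2 threads + 2 OMP threads",    104.074, 67.572, 2.072},
    {"2 threads + 4 OMP threads",     98.649, 63.167, 1.954},
    {"2 threads + 6 OMP threads",     99.057, 64.136, 1.942},
    {"6 threads + 6 OMP threads",     93.462, 59.088, 0.949},
};
const int kNumRuns = sizeof(kRuns) / sizeof(kRuns[0]);

// Ratio of a timing to its baseline; values above 1 mean slower than baseline.
double ratio_to_baseline(double value, double baseline) {
    return value / baseline;
}

// Print every run as ratios relative to the first (CRAN) row.
void print_ratios() {
    const Run& base = kRuns[0];
    std::printf("%-32s %8s %8s %8s\n", "run", "TOTAL", "SE", "Post");
    for (int i = 0; i < kNumRuns; ++i) {
        std::printf("%-32s %8.2f %8.2f %8.2f\n", kRuns[i].label,
                    ratio_to_baseline(kRuns[i].total, base.total),
                    ratio_to_baseline(kRuns[i].se, base.se),
                    ratio_to_baseline(kRuns[i].post, base.post));
    }
}
```

Read this way, the SE step on this branch without mirtCluster() takes about 1.80 times as long as on CRAN (58.372 / 32.364), while the Post step in the 6 + 6 run drops to about 0.29 of the CRAN time (0.949 / 3.313).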