krisrs1128 / gflasso

An experimental package for graph-structured multi-task regression

Paralleling not working #11

Open candelas762 opened 1 year ago

candelas762 commented 1 year ago

Hi. This package seems to have been inactive for some time, but I would like to point out that parallelization does not seem to be working. The elapsed time is about 17 s whether nCores is 1, 2, or 10:

> set.seed(100)
> X <- matrix(rnorm(100 * 10), 100, 10)
> u <- matrix(rnorm(10), 10, 1)
> B <- u %*% t(u) + matrix(rnorm(10 * 10, 0, 0.1), 10, 10)
> Y <- X %*% B + matrix(rnorm(100 * 10), 100, 10)
> R <- ifelse(cor(Y) > .8, 1, 0)
> system.time(testCV <- cv_gflasso(X, Y, R, nCores = 1))
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=17s  
[1] 1.490671 1.612144 1.386832 1.466570 1.278058
   user  system elapsed 
  16.65    0.28   16.97 
> system.time(testCV <- cv_gflasso(X, Y, R, nCores = 2))
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=17s  
[1] 1.490671 1.612144 1.386832 1.466570 1.278058
   user  system elapsed 
  16.47    0.32   16.80 
> system.time(testCV <- cv_gflasso(X, Y, R, nCores = 10))
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=17s  
[1] 1.490671 1.612144 1.386832 1.466570 1.278058
   user  system elapsed 
  16.33    0.39   16.72

krisrs1128 commented 1 year ago

Thank you for reporting this. I hadn't been aware of the parallelization not working. I will take a look when I have a chance, though I can't guarantee it will be soon.

azguan commented 3 months ago

I had the same issue. According to the documentation for pblapply, this appears to be specific to Windows machines: when cl is an integer, pblapply forks child processes on Unix-like systems, but on Windows the integer is ignored and the computation runs sequentially. To work on both types of systems, a cluster needs to be created explicitly and passed as cl, i.e.:

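library(parallel)    # makeCluster, clusterExport, stopCluster
library(doParallel)  # registerDoParallel
cl <- makeCluster(nCores)
registerDoParallel(cl)  # only matters for foreach's %dopar%; pblapply takes cl directly
clusterExport(cl, c("gflasso", "cvFUN", "R", "rmse"))
# ------ some code ------
allCV <- pbapply::pblapply(...., cl = cl)
stopCluster(cl)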
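
For anyone who wants to try the explicit-cluster pattern outside the package, here is a minimal, self-contained sketch. It makes several assumptions: fold_rmse and rmse below are hypothetical stand-ins, not the package's cvFUN or rmse, and it uses parallel::parLapply rather than pblapply so that it runs with base R only. pblapply accepts the same cluster object through its cl argument.

```r
library(parallel)

# Hypothetical helper, standing in for the package's own rmse.
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

# Hypothetical stand-in for one cross-validation fold (not gflasso's cvFUN).
fold_rmse <- function(k) {
  set.seed(k)
  x <- stats::rnorm(50)
  y <- 2 * x + stats::rnorm(50)
  rmse(y, 2 * x)
}

n_cores <- 2
# A socket cluster works on Windows as well as on Unix-like systems.
cl <- makeCluster(n_cores)

# Workers start with an empty workspace, so helpers must be exported.
clusterExport(cl, "rmse", envir = environment())

# Run the folds on the workers and always shut the cluster down afterwards.
res_par <- tryCatch(
  parLapply(cl, 1:4, fold_rmse),
  finally = stopCluster(cl)
)

# Sequential reference for comparison.
res_seq <- lapply(1:4, fold_rmse)
```

Without the clusterExport call the workers would likely fail with an "object not found" style error, which is presumably why the snippet above exports gflasso, cvFUN, R and rmse.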