ck37 / varimpact

Variable importance through targeted causal inference, with Alan Hubbard

Parallel computing issues #2

Closed ahubb40 closed 7 years ago

ahubb40 commented 8 years ago

Hi Chris,

When I use foreach and registerDoParallel for another function before calling varImpact, and then set parallel=T in varImpact, I get an error:

  Error in summary.connection(connection) : invalid connection
  Calls: ... sendData.SOCKnode -> serialize -> summary -> summary.connection

My previous code before calling varImpact looks like this:

  cl <- makeCluster(V)
  registerDoParallel(cl)
  fit.test <- origami_SuperLearner(Y = Y, X = Xdat, SL.library = SL.library,
                                   method = method.NNLS(), family = gaussian(), V = 2)
  stopCluster(cl)

ck37 commented 8 years ago

It looks to me like the issue is that "cl" is registered as the foreach backend via registerDoParallel(), but stopCluster(cl) then shuts down its workers without unregistering it. So when varImpact() later tries to use the registered cluster, the connections to its workers are no longer valid. If you comment out the stopCluster() line, does that fix it?
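A minimal sketch of that failure mode, using only foreach and doParallel and no varImpact code. The worker count of 2 and the variable names are illustrative assumptions. It registers a cluster, stops it, shows that a later %dopar% call fails, and then resets the backend with registerDoSEQ():

```r
library(doParallel)  # also attaches foreach and parallel

cl <- makeCluster(2)       # 2 workers is an arbitrary choice for this sketch
registerDoParallel(cl)

# Works while the workers are alive.
ok <- foreach(i = 1:2, .combine = c) %dopar% i^2

stopCluster(cl)

# The registered backend still points at the stopped cluster, so this call
# fails; the error message is captured instead of aborting the script.
stale <- tryCatch(
  foreach(i = 1:2, .combine = c) %dopar% i^2,
  error = function(e) conditionMessage(e)
)

# Registering the sequential backend replaces the stale registration.
registerDoSEQ()
reset <- foreach(i = 1:2, .combine = c) %dopar% i^2
```

Registering a fresh cluster with registerDoParallel() after stopCluster() would be the alternative to registerDoSEQ() if later code should still run in parallel.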

ahubb40 commented 8 years ago

I'll try again, but I think that was one of the permutations I tried that didn't work (from what I read online I suspected that might be the problem, but I think I still had the same issue). Just to confirm, it all works fine if I use parallel=F, so the problem must be in the parallel path. I'll give this a try to make sure and see whether it fails.
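One way to check the parallel path on its own before calling a function that relies on the registered backend is to probe it with a trivial task. The helper name backend_is_alive is hypothetical and not part of foreach, doParallel, or varImpact; this is a sketch under that assumption:

```r
library(doParallel)  # also attaches foreach and parallel

# Hypothetical helper: TRUE if the currently registered foreach backend
# can run a trivial task, FALSE if the attempt raises an error.
backend_is_alive <- function() {
  tryCatch({
    foreach(i = 1, .combine = c) %dopar% i
    TRUE
  }, error = function(e) FALSE)
}

cl <- makeCluster(2)       # 2 workers is an arbitrary choice for this sketch
registerDoParallel(cl)
alive_before <- backend_is_alive()

stopCluster(cl)
alive_after <- backend_is_alive()

# Fall back to sequential execution if the registered cluster is gone.
if (!alive_after) registerDoSEQ()
```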

Alan Hubbard Division of Biostatistics UC Berkeley (510)643-6160 http://hubbard.berkeley.edu


ck37 commented 8 years ago

Ok I tried a quick test case and it does seem to work without error:

library(doParallel)
cl <- makeCluster(4)
registerDoParallel(cl)
data(BreastCancer, package="mlbench")
data = BreastCancer
data$Y = as.numeric(data$Class == "malignant")
vim = varImpact(Y = data$Y, data = subset(data, select=-c(Y, Class, Id)))
vim

ck37 commented 7 years ago

Closing this old varImpact issue (feel free to re-open though!)