Hi,
I haven't run into this problem. Given that it is happening in the middle of a JAGS update (and not during pre- or post-processing), I don't think it's an issue with jagsUI. How are you confirming that memory isn't the issue? Have you tried running the analysis with parallel=FALSE? Also note that in the new version of the package I just released, you can manually set the number of cores used; you might try setting n.cores to 2. The analysis will take longer, but maybe you can avoid the error.
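For example, a minimal sketch of such a call (Data, Inits, monitor, and "model.txt" are placeholders for your own data list, inits function, monitored parameters, and model file):

require(jagsUI)

out <- jags(data = Data, inits = Inits, parameters.to.save = monitor,
            model.file = "model.txt", n.chains = 3, n.adapt = 1000,
            n.iter = 5000, n.burnin = 1000, n.thin = 2,
            parallel = TRUE, n.cores = 2)  # cap workers at 2 to reduce peak memory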
Thanks for the prompt answer! I think the problem is in the clusterApply call inside the run.parallel function (this page says that parallel consumes a lot of memory and describes a situation with the same error: http://gforge.se/2015/02/how-to-go-parallel-in-r-basics-tips/#Memory_load). I'm working on a cluster that reports memory consumed; I have 5.2G available and it says maxvmem 1.240G. With parallel=FALSE it works fine. I tried n.cores=3 with 3 chains and n.cores=2 with 2 chains, but it doesn't work; I get the same error... :(
I have a similar message and don't know how to deal with it:
Error in unserialize(node$con) : error reading from connection
12 unserialize(node$con)
11 recvData.SOCKnode(con)
10 recvData(con)
9 FUN(X[[i]], ...)
8 lapply(cl, recvResult)
7 checkForRemoteErrors(lapply(cl, recvResult))
6 clusterCall(cl, .runjags)
5 tryCatchList(expr, classes, parentenv, handlers)
4 tryCatch(res <- clusterCall(cl, .runjags), finally = stopCluster(cl))
3 jags.parallel(data = d$data, inits = d$inits, parameters.to.save = d$params,
model.file = model.jags, n.chains = nc, n.thin = nt, n.iter = ni,
n.burnin = nb, working.directory = NULL, n.cluster = n.cluster) at model_sim_simple.R#217
2 run.model(d = dd, ni = (ni) * scale, nt = nt, nb = nb * scale,
nc = nc, n.cluster = n.cluster) at analyses_sim_simple.R#97
1 analyse(d = data, scale = 1, ni = c(1000 + 5000), nt = 15, nb = 5000,
nc = 3, n.cluster = 3, save.directory = "~/Desktop/large saved file/",
file.name = paste(name.file), diagnostic.graph = FALSE, init = list(mu.phi.sp = c(-0.5)))
Is there a solution to this kind of problem? I read that it could be a problem with the parallel package.
I couldn't solve it directly, unfortunately. Instead, I used the following approach without using the facilities of the jagsUI package:
require(rjags)
require(doParallel)
registerDoParallel(cores = 2)  # one foreach backend is enough; doMC would be redundant

# JAGS modules (lecuyer provides per-chain RNG streams)
load.module("glm"); load.module("lecuyer"); load.module("mix"); load.module("bugs")

samples <- 1000       # final number of samples per chain
thin <- 100           # amount of thinning to apply
n.adapt <- 10000      # adaptation iterations
burn.in.prop <- 0.1

# p1, p2, p3 are assumed to be defined earlier in the script
Inits <- function() {
  list(.RNG.name = "lecuyer::RngStream", .RNG.seed = sample(1:1e+06, 1),
       par = c(p1[1], p2[1], p3[1]), par3 = c(0, 0, 0),
       qEta = 0.001, li = 0.7, CVR = 1.5)
}

# Run one single-chain model per worker and record the elapsed time
system.time(resultList <- foreach(i = 1:2, .combine = "c",
                                  .packages = "rjags") %dopar% {
  jm2 <- jags.model("modmodela.txt", data = Data, inits = Inits,
                    n.chains = 1, n.adapt = n.adapt)
  coda.samples(jm2, monitor, n.iter = samples * thin, thin = thin)
})

# Re-tag the combined per-worker chains as a single mcmc.list
chainsp <- resultList
class(chainsp) <- "mcmc.list"
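To inspect the combined chains afterwards, a hypothetical follow-up (assuming the code above completed) could be:

require(coda)
summary(chainsp)       # posterior summaries pooled across both chains
gelman.diag(chainsp)   # Gelman-Rubin convergence diagnostic (needs >= 2 chains)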
Thanks for your answer! I found the error. It's a memory load problem (see the end of the page linked above) that produces the Error in unserialize(node$con) : error reading from connection. If you track too many parameters, you'll get this error message.
Using a FORK cluster can be another solution.
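A rough sketch of that idea (FORK clusters exist only on Linux/macOS, and run.one.chain is a hypothetical wrapper for fitting one chain): forked workers share the parent session's memory copy-on-write, so large data objects are not serialized out to each worker.

require(parallel)

# FORK workers inherit the parent's packages and objects copy-on-write
cl <- makeCluster(2, type = "FORK")

# run.one.chain is a placeholder for a function that fits one JAGS chain
results <- clusterApply(cl, 1:2, function(i) run.one.chain(i))

stopCluster(cl)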
Thanks, I will try!
Thanks @beausoleilmo. I have also found that this error seems to occur when R runs out of memory while tracking a large number of parameters. Another potential solution is to use the jags.basic() function in jagsUI, which still allows running chains in parallel but does not calculate output summaries (and thus might save some space in RAM). I haven't tested this, though.
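A minimal sketch of that approach (Data, Inits, monitor, and "model.txt" are placeholders for your own objects; check ?jags.basic for the exact arguments):

require(jagsUI)

# Returns the raw posterior samples (an mcmc.list) without computing summaries
out <- jags.basic(data = Data, inits = Inits, parameters.to.save = monitor,
                  model.file = "model.txt", n.chains = 3, n.adapt = 1000,
                  n.iter = 5000, n.burnin = 1000, n.thin = 2,
                  parallel = TRUE)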
Hi, I got this error at this point:

ttot <- system.time(chainspre <- autojags(data = Data, inits = Inits,
    parameters.to.save = monitor, model.file = "modmodela.txt",
    n.chains = 3, n.adapt = 10000, iter.increment = 30000, n.burnin = 10000,
    n.thin = 20, save.all.iter = FALSE,
    modules = c('glm', 'mix', 'bugs', 'lecuyer'), parallel = TRUE,
    DIC = TRUE, store.data = FALSE, codaOnly = FALSE,
    seed = floor(runif(1, 1, 10000)), bugs.format = FALSE,
    Rhat.limit = 1.1, max.iter = 200000, verbose = TRUE))

Processing function input.......
Done.
Burn-in + Update 1 (40000)
Timing stopped at: 0.019 0.021 3665.864
In forums they say it's about memory: a worker sees a process about to exceed its memory limit and kills it. But when I check memory it is fine. Maybe this error has happened to someone before and someone has a clue about how to solve it.
Thanks for your help!