bob-carpenter commented 9 years ago

From @GCMeyers on stan-dev/stan: https://github.com/stan-dev/stan/issues/1654

I just upgraded to rstan v 2.8 from v 2.6. An analysis that I am doing calls stan in a loop. For a low number of calls, everything works. For a higher number of calls, errors happen. These errors did not happen when I was using 2.6.

[ ] add a unit test that fails in the current version of RStan
[ ] fix this resource allocation bug so that it pasts the test

maverickg commented 9 years ago

duplicate of https://github.com/stan-dev/rstan/issues/219?

Otherwise, we need code to replicate the error.

GGMeyers commented 9 years ago

This appears to be the same issue. But the workaround using "showAllConnections()" seemed to cause other problems. Below I will show first - the part of the script that induces the error, and second - the error messages am getting. I use Stan 2.8 and R 3.2. If necessary I can send the data and the entire script.

++++ Part of script that triggers the error library(doParallel) cl <- makeCluster(4) registerDoParallel(cl) rho.mcmc=foreach (k=1:num.mcmc,.combine=rbind) %dopar%{ library(rstan) data.for.stan=list(logprem1 = log(premium1[1:10]), logprem2 = log(premium2[1:10]), X = X, rhomax = rhomax, w = rdata1$w, d = rdata1$d, alpha1 = alpha1[k,], alpha2 = alpha2[k,], beta1 = beta1[k,], beta2 = beta2[k,], logelr1 = logelr1[k], logelr2 = logelr2[k], gamma1 = gamma1[k], gamma2 = gamma2[k], delta1 = delta1[k], delta2 = delta2[k], sig_1 = sig_1[k,], sig_2 = sig_2[k,]) fit1 = stan(fit=fit0,data=data.for.stan,chains=1,iter=100, seed=12345,init=initB,verbose=F) #

extract information from jags output to process in R

# rho=extract(fit1,"rho")$rho[50] #

Rhat is commmented out after experimenting as it doubles the run time

Rhat=as.matrix(summary(fit1)$summary)[1,10]

rho.mcmc.out=c(rho,Rhat)

closeAllConnections() }

++++ error messages Loading required package: Rcpp Loading required package: ggplot2 rstan (Version 2.8.0, packaged: 2015-09-19 14:48:38 UTC, GitRev: 05c3d0058b6a) For execution on a local, multicore CPU with excess RAM we recommend calling rstan_options(auto_write = TRUE) options(mc.cores = parallel::detectCores()) Loading required package: foreach foreach: simple, scalable parallel programming from Revolution Analytics Use Revolution R for scalability, fault tolerance and more. http://www.revolutionanalytics.com Loading required package: iterators the number of chains is less than 1; sampling not done [1] "Maximum Rhat = 1.0038 Mean lp = 61.86 Thin = 1" [1] "Maximum Rhat = 1.0039 Mean lp = 99.97 Thin = 1" the number of chains is less than 1; sampling not done Error in unserialize(socklist[[n]]) : error reading from connection Error in summary.connection(connection) : invalid connection Calls: ... sendData.SOCKnode -> serialize -> summary -> summary.connection Execution halted Error in summary.connection(connection) : invalid connection Calls: ... sendData.SOCKnode -> serialize -> summary -> summary.connection Execution halted Error in summary.connection(connection) : invalid connection Calls: ... sendData.SOCKnode -> serialize -> summary -> summary.connection Execution halted Error in summary.connection(connection) : invalid connection Calls: ... sendData.SOCKnode -> serialize -> summary -> summary.connection Execution halted

showConnections() description class mode text isopen can read can write 3 "<-localhost:11663" "sockconn" "a+b" "binary" "opened" "yes" "yes"
4 "<-localhost:11663" "sockconn" "a+b" "binary" "opened" "yes" "yes"
5 "<-localhost:11663" "sockconn" "a+b" "binary" "opened" "yes" "yes"
6 "<-localhost:11663" "sockconn" "a+b" "binary" "opened" "yes" "yes"

sakrejda commented 9 years ago

The closeAllConnections command is also deleting the connections used to communicate with the parallel execution processes. The solution is to collect only rstan's stray connections and close them explicitly. Is there something special about your setup that generates this error or could you produce a self contained example that exhibits this?

GGMeyers commented 9 years ago

Before sending the entire code and data files, is there some way to identify the “stray connections?"

Glenn

On Oct 30, 2015, at 10:12 AM, Krzysztof Sakrejda notifications@github.com wrote:

The closeAllConnections command is also deleting the connections used to communicate with the parallel execution processes. The solution is to collect only rstan's stray connections and close them explicitly. Is there something special about your setup that generates this error or could you produce a self contained example that exhibits this?

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/rstan/issues/225#issuecomment-152536990.

sakrejda commented 9 years ago

I think calling showConnections() returns a table that should have enough info to differentiate which ones are from Stan.

GGMeyers commented 9 years ago

Here is what I get after the the program has bombed. Which ones are from stan?

showConnections() description class mode text isopen can read can write 3 "<-localhost:11663" "sockconn" "a+b" "binary" "opened" "yes" "yes"
4 "<-localhost:11663" "sockconn" "a+b" "binary" "opened" "yes" "yes"
5 "<-localhost:11663" "sockconn" "a+b" "binary" "opened" "yes" "yes"
6 "<-localhost:11663" "sockconn" "a+b" "binary" "opened" "yes" "yes"

Glenn

On Oct 30, 2015, at 10:34 AM, Krzysztof Sakrejda notifications@github.com wrote:

I think calling showConnections() returns a table that should have enough info to differentiate which ones are from Stan.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/rstan/issues/225#issuecomment-152542133.

sakrejda commented 9 years ago

Good question, is that the output of showCommections() for the script bombing with closeAllConnections() or without? It looks like you did closeAllConnections() within the foreach loop so it won't show rstan's connections.. (?) If that's the case try showConnections after the script bombs without closeAllConnections()

GGMeyers commented 9 years ago

Here what I got after running the loop with only 4 iterations and the program ended normally.

showConnections() description class mode text isopen can read can write

Glenn

On Oct 30, 2015, at 10:53 AM, Krzysztof Sakrejda notifications@github.com wrote:

Good question, is that the output of showCommections() for the script bombing with closeAllConnections() or without? It looks like you did closeAllConnections() within the foreach loop so it won't show rstan's connections.. (?) If that's the case try showConnections after the script bombs without closeAllConnections()

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/rstan/issues/225#issuecomment-152546783.

GGMeyers commented 9 years ago

The below was sent prematurely. I wanted also to include what happen without the closeAllConnections() and the program bombed with 10000 iterations.

So what was created by stan?

Also, If I know, how do I close it. In looking at the R help I get

close(con, ...)

S3 method for class 'connection'

close(con, type = "rw", ...) What is con, or more precisely how do I refer to the connection I want to close.

+++++++++

Error in { : task 487 failed - "all connections are in use"

showConnections() description class mode text isopen can read can write 3 "<-localhost:11994" "sockconn" "a+b" "binary" "opened" "yes" "yes"
4 "<-localhost:11994" "sockconn" "a+b" "binary" "opened" "yes" "yes"
5 "<-localhost:11994" "sockconn" "a+b" "binary" "opened" "yes" "yes"
6 "<-localhost:11994" "sockconn" "a+b" "binary" "opened" "yes" "yes"

Glenn

On Oct 30, 2015, at 11:29 AM, Glenn Meyers ggmeyers@metrocast.net wrote:

Here what I got after running the loop with only 4 iterations and the program ended normally.

showConnections() description class mode text isopen can read can write

Glenn

On Oct 30, 2015, at 10:53 AM, Krzysztof Sakrejda <notifications@github.com mailto:notifications@github.com> wrote:

Good question, is that the output of showCommections() for the script bombing with closeAllConnections() or without? It looks like you did closeAllConnections() within the foreach loop so it won't show rstan's connections.. (?) If that's the case try showConnections after the script bombs without closeAllConnections()

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/rstan/issues/225#issuecomment-152546783.

GGMeyers commented 9 years ago

After searching the various help pages in R, I got the fix I needed to get my script to run. What worked was inserting the following into the “foreach” loop.

ncon=dim(showConnections())[1] for (i in 7:ncon){ if(ncon>6){close.connection(getConnection(i))} }

Can I assume that this workaround will not be needed in future release of rstan and/or R?

Glenn

On Oct 30, 2015, at 10:53 AM, Krzysztof Sakrejda notifications@github.com wrote:

Good question, is that the output of showCommections() for the script bombing with closeAllConnections() or without? It looks like you did closeAllConnections() within the foreach loop so it won't show rstan's connections.. (?) If that's the case try showConnections after the script bombs without closeAllConnections()

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/rstan/issues/225#issuecomment-152546783.

bgoodri commented 9 years ago

Yes

sakrejda commented 9 years ago

@GGMeyers, thanks for putting up a functional work-around.

GGMeyers commented 9 years ago

You provided the crucial hint that I had to close only those connections that were open by stan. I had to figure out what a "connection" was and how to identify which ones to close. To others who use this workaround - I had two files and 4 cores connected. Other scripts may have a different number of connections open.

bgoodri commented 9 years ago

If you install the next rstan from GitHub by executing in R

devtools::install_github("stan-dev/rstan", ref = "develop", subdir = "rstan/rstan")

then you won't need this workaround. You can ignore the fact that it will say at the end

Warning message:
Github repo contains submodules, may not function as expected!

GGMeyers commented 9 years ago

I tried it. The code that I have been running did not generate the connection error. Instead if just ran through the first sections as usual, and then froze when it got to the foreach loop that contained stan. I reinstalled 2.8 and with the workaround, it completed normally in about 6 minutes.

Glenn

On Oct 31, 2015, at 10:11 AM, bgoodri notifications@github.com wrote:

If you install the next rstan from GitHub by executing in R

devtools::install_github("stan-dev/rstan", ref = "develop", subdir = "rstan/rstan") then you won't need this workaround. You can ignore the fact that it will say at the end

Warning message: Github repo contains submodules, may not function as expected! — Reply to this email directly or view it on GitHub https://github.com/stan-dev/rstan/issues/225#issuecomment-152739031.

maverickg commented 9 years ago

@GGMeyers please share the code to replicate the error you described.

GGMeyers commented 9 years ago

Below is the code that should replicate the error. With the workaround it completes without error on Stan 2.8.0. I pasted it below as I am unsure what GitHub does with attached files.

Some comments. I commented out the workaround on lines 418-424 I believe the error occurs in the foreach loop that is between lines 390 to 433. The code calls two datasets. There is a link to the datasets on line 13. Or if you want I can send the datasets to you (about 1 mb each).

At a very high level the code first runs Stan on models with each of the two datasets producing samples of size 10000. Then using each of the 10000 samples it runs a one-parameter Stan model that attempts to capture the dependency (or lack thereof) between the first two models.

I notice that the model in 2.8 runs noticeably faster in 2.8 that in did in 2.6.

the code ++++++++++++++++++

Script to run the bivariate CSR model

by Glenn Meyers

rm(list = ls()) # clear workspace") t0=Sys.time() #

user inputs

# grpcode="620" outfilename=paste("CSR Model2Step",grpcode,".csv") setwd("~/Dropbox/Dependencies in SLR Models") #

source of insurer data

http://www.casact.org/research/index.cfm?fa=loss_reserves_data

# insurer.data1="~/Dropbox/CAS Loss Reserve Database/comauto_pos.csv" insurer.data2="~/Dropbox/CAS Loss Reserve Database/ppauto_pos.csv" library(mvtnorm) library(rstan) rstan_options(auto_write = TRUE) options(mc.cores = parallel::detectCores()) library(parallel) library(doParallel) #

function to get Schedule P triangle data given ins group and line of business

# ins.line.data=function(g.code){ b=subset(a,a$GRCODE==g.code) name=b$GRNAME grpcode=b$GRCODE w=b$AccidentYear d=b$DevelopmentLag cum_incloss=b[,6] cum_pdloss=b[,7] bulk_loss=b[,8] dir_premium=b[,9] ced_premium=b[,10] net_premium=b[,11] single=b[,12] posted_reserve97=b[,13]

get incremental paid losses - assume data is sorted by ay and lag

inc_pdloss=numeric(0) for (i in unique(w)){ s=(w==i) pl=c(0,cum_pdloss[s]) ndev=length(pl)-1 il=rep(0,ndev) for (j in 1:ndev){ il[j]=pl[j+1]-pl[j] } inc_pdloss=c(inc_pdloss,il) } data.out=data.frame(grpcode,w,d,net_premium,dir_premium,ced_premium, cum_pdloss,cum_incloss,bulk_loss,inc_pdloss,single,posted_reserve97) return(data.out) } #

read and aggregate the insurer data and

set up training and test data frames

for line 1

# a=read.csv(insurer.data1) cdata=ins.line.data(grpcode) w=cdata$w-1987 d=cdata$d #

sort the data in order of d, then w within d

# o1=100*d+w o=order(o1) w=w[o] d=d[o] premium=cdata$net_premium[o] cpdloss=cdata$cum_pdloss[o] cpdloss=pmax(cpdloss,1) adata1=data.frame(grpcode,w,d,premium,cpdloss) rdata1=subset(adata1,(adata1$w+adata1$d)<12) numw=length(unique(rdata1$w)) aloss1=adata1$cpdloss rloss1=rdata1$cpdloss premium1=rdata1$premium #

read and aggregate the insurer data and

set up training and test data frames

for line 2

# a=read.csv(insurer.data2) cdata=ins.line.data(grpcode) w=cdata$w-1987 d=cdata$d #

sort the data in order of d, then w within d

# rhomax=0.9 o1=100*d+w o=order(o1) w=w[o] d=d[o] premium=cdata$net_premium[o] cpdloss=cdata$cum_pdloss[o] cpdloss=pmax(cpdloss,1) adata2=data.frame(grpcode,w,d,premium,cpdloss) rdata2=subset(adata2,(adata2$w+adata2$d)<12) aloss2=adata2$cpdloss rloss2=rdata2$cpdloss premium2=rdata2$premium #

univariate models

# #

Stan script for univariate models

# scodeU = " data{ real logprem[10]; real logloss[55]; int w[55]; int d[55]; } parameters{ real r_alpha[9]; real r_beta[9]; real logelr; real a[10]; real gamma; real delta; } transformed parameters{ real alpha[10]; real beta[10]; real speedup[10]; real sig2[10]; real sig[10]; real mu[55]; alpha[1]<-0; for (i in 2:10) alpha[i] <- r_alpha[i-1]; for (i in 1:9) beta[i] <- 10_rbeta[i]-5; beta[10] <- 0; speedup[1] <- 1; for (i in 2:10) speedup[i] <- speedup[i-1](1-gamma-(i-2)_delta); sig2[10] <- a[10]; for (i in 1:9) sig2[10-i] <- sig2[11-i]+a[i]; for (i in 1:10) sig[i] <- sqrt(sig2[i]); for (i in 1:55){ mu[i] <- logprem[w[i]]+logelr+alpha[w[i]]+beta[d[i]]_speedup[w[i]]; } } model { for (i in 1:9) r_alpha[i] ~ normal(0,3.162); for (i in 1:9) r_beta[i] ~ uniform(0,1); logelr ~ uniform(-1.5,0.5); gamma ~ normal(0,0.05); delta ~ normal(0,0.01); for (i in 1:55) logloss[i] ~ normal(mu[i],sig[d[i]]); } " #

initialization function for scodeU

# initU=function(chain_id){ set.seed(12345+chain_id) list(r_alpha=rnorm(9,0,0.2),r_beta=runif(9),a=runif(10), logelr=runif(1,-0.75,-0.5),gamma=rnorm(1,0,0.1), delta=rnorm(1,0,0.02)) }

univariate line 1

# #

data for univariate 1

# data.u1=list(logprem = log(rdata1$premium[1:10]), logloss = log(rloss1), w = rdata1$w, d = rdata1$d) #

compile the univariate model

# fitU = stan(model_code=scodeU,data=data.u1,seed=12345,init=initU,chains=0)

fitU.debug = stan(fit=fitU,data=data.u1,init=initU,chains=4,

seed=12345,iter=100,verbose=F)

#

run the univariate model for line 1

# stan_thin=1 stan_iter=5000 Rhat_target=1.05 max_Rhat=2 while ((max_Rhat > Rhat_target)&(stan_thin<65)){ sflist <- mclapply(1:4, mc.cores = 4, function(i) stan(fit = fitU, data = data.u1,init=initU, seed = 12345,iter=stan_iter,thin=stan_thin, chains = 1, chain_id = i, cores=4)) fitU1=sflist2stanfit(sflist) fitU1_summary=as.matrix(summary(fitU1)$summary)[,c(1,3,10)] mrh=subset(fitU1_summary,is.na(fitU1_summary[,3])==F) max_Rhat=round(max(mrh[,3]),4) mean_lp=round(fitU1_summary[dim(fitU1_summary)[1],1],2) print(paste("Maximum Rhat =",max_Rhat," Mean lp__ =",mean_lp, "Thin =",stan_thin)) stan_thin=2_stan_thin stan_iter=2_stan_iter }

print(fitU1_summary)

#

optional diagnostics

#

plot(fitU1)

traceplot(fitU1)

#

extract information from stan output to process in R

# b1=extract(fitU1,c("alpha","beta","gamma","delta","logelr","sig")) alpha1=b1$alpha beta1=b1$beta gamma1=b1$gamma delta1=b1$delta logelr1=b1$logelr sig_1=b1$sig # # set.seed(12345) cl <- makePSOCKcluster(4) registerDoParallel(cl) at1.wd10=foreach (k=1:length(gamma1),.combine=rbind) %dopar%{ atv=rep(0,10) for (w in 1:10){ atv[w]=rlnorm(1,log(premium1[w])+logelr1[k]+alpha1[k,w],sig_1[k,10]) } at=atv } stopCluster(cl) #

obtain outcomes

# outcomes1=rowSums(at1.wd10)

univariate line 2

# #

data for univariate line 2

# data.u2=list(logprem = log(rdata2$premium[1:10]), logloss = log(rloss2), w = rdata2$w, d = rdata2$d)# #

run the univariate model for line 2

# stan_thin=1 stan_iter=5000 Rhat_target=1.05 max_Rhat=2 while ((max_Rhat > Rhat_target)&(stan_thin<65)){ sflist <- mclapply(1:4, mc.cores = 4, function(i) stan(fit = fitU, data = data.u2,init=initU, seed = 12345,iter=stan_iter,thin=stan_thin, chains = 1, chain_id = i, cores=4)) fitU2=sflist2stanfit(sflist) fitU2_summary=as.matrix(summary(fitU2)$summary)[,c(1,3,10)] mrh=subset(fitU2_summary,is.na(fitU2_summary[,3])==F) max_Rhat=round(max(mrh[,3]),4) mean_lp=round(fitU2_summary[dim(fitU2_summary)[1],1],2) print(paste("Maximum Rhat =",max_Rhat," Mean lp__ =",mean_lp, "Thin =",stan_thin)) stan_thin=2_stan_thin stan_iter=2_stan_iter }

print(fitU2_summary)

#

optional diagnostics

#

plot(fitU1)

traceplot(fitU1)

#

extract information from stan output to process in R

# b2=extract(fitU2,c("alpha","beta","gamma","delta","logelr","sig")) alpha2=b2$alpha beta2=b2$beta gamma2=b2$gamma delta2=b2$delta logelr2=b2$logelr sig_2=b2$sig #

simulate loss statistics by accident year

# set.seed(12345) cl <- makePSOCKcluster(4) registerDoParallel(cl) at2.wd10=foreach (k=1:length(gamma2),.combine=rbind) %dopar%{ atv=rep(0,10) for (w in 1:10){ atv[w]=rlnorm(1,log(premium2[w])+logelr2[k]+alpha2[k,w],sig_2[k,10]) } at=atv } stopCluster(cl) #

obtain outcomes

# outcomes2=rowSums(at2.wd10)

#

Stan Code

# scode=" data { int w[55]; int d[55]; real rhomax; vector[2] X[55]; real logprem1[10]; real logprem2[10]; real logelr1; real logelr2; real alpha1[10]; real alpha2[10]; real beta1[10]; real beta2[10]; real gamma1; real gamma2; real delta1; real delta2; real sig_1[10]; real sig2[10]; } transformed data{ vector[2] Mu[55]; real speedup1[10]; real speedup2[10]; speedup1[1]<-1; speedup2[1]<-1; for (i in 2:10){ speedup1[i]<-speedup1[i-1](1-gamma1-(i-2)delta1); speedup2[i]<-speedup2[i-1](1-gamma2-(i-2)_delta2); } for (i in 1:55){ Mu[i][1]<-logprem1[w[i]]+logelr1+alpha1[w[i]]+beta1[d[i]]_speedup1[w[i]]; Mu[i][2]<-logprem2[w[i]]+logelr2+alpha2[w[i]]+beta2[d[i]]_speedup2[w[i]]; } } parameters { real r_rho; } transformed parameters { matrix[2,2] Sigma[55]; real rho; rho<-2_rhomax_r_rho-rhomax; for (i in 1:55){ Sigma[i][1,1]<-sig_1[d[i]]^2; Sigma[i][1,2]<-rho_sig_1[d[i]]_sig_2[d[i]]; Sigma[i][2,1]<-Sigma[i][1,2]; Sigma[i][2,2]<-sig_2[d[i]]^2; } } model { r_rho ~ beta(2,2); for (i in 1:55){ X[i] ~ multi_normal(Mu[i],Sigma[i]); } } " initB=list(list(r_rho=0.5)) X=cbind(log(rloss1),log(rloss2)) #

compile the bivariate model

# k=1 data.for.stan=list(logprem1 = log(premium1[1:10]), logprem2 = log(premium2[1:10]), X = X, rhomax = rhomax, w = rdata1$w, d = rdata1$d, alpha1 = alpha1[k,], alpha2 = alpha2[k,], beta1 = beta1[k,], beta2 = beta2[k,], logelr1 = logelr1[k], logelr2 = logelr2[k], gamma1 = gamma1[k], gamma2 = gamma2[k], delta1 = delta1[k], delta2 = delta2[k], sig_1 = sig_1[k,], sig_2 = sig_2[k,]) fit0 = stan(model_code=scode,data=data.for.stan,chains=0,seed=12345,init=initB)

fit_debug = stan(fit=fit0,data=data.for.stan,chains=1,iter=10,verbose=F)

num.mcmc=length(b1$gamma) #

set up parallel processing

# library(doParallel) cl <- makeCluster(4) registerDoParallel(cl) rho.mcmc=foreach (k=1:num.mcmc,.combine=rbind) %dopar%{ library(rstan) data.for.stan=list(logprem1 = log(premium1[1:10]), logprem2 = log(premium2[1:10]), X = X, rhomax = rhomax, w = rdata1$w, d = rdata1$d, alpha1 = alpha1[k,], alpha2 = alpha2[k,], beta1 = beta1[k,], beta2 = beta2[k,], logelr1 = logelr1[k], logelr2 = logelr2[k], gamma1 = gamma1[k], gamma2 = gamma2[k], delta1 = delta1[k], delta2 = delta2[k], sig_1 = sig_1[k,], sig_2 = sig_2[k,]) fit1 = stan(fit=fit0,data=data.for.stan,chains=1,iter=100, seed=12345,init=initB,verbose=F) #

workaround for connection bugs in stan 2.8.0

#

ncon=dim(showConnections())[1]

for (i in 7:ncon){

if(ncon>6){close.connection(getConnection(i))}

}

#

extract information from MCMC output to process in R

# rho=extract(fit1,"rho")$rho[50] #

Rhat is commmented out after experimenting as it doubles the run time

Rhat=as.matrix(summary(fit1)$summary)[1,10]

rho.mcmc.out=c(rho,Rhat)

} #

get the predictive distribution for line1, line2 and line1+line2

# at.wd10=foreach (k=1:num.mcmc,.combine=rbind) %dopar%{ library(mvtnorm) atu1=rep(0,10) atu2=rep(0,10) Mu=rep(0,2) Sigma=matrix(0,2,2) Sigma[1,1]=sig_1[k,10]^2 Sigma[2,1]=rho.mcmc[k]_sig_1[k,10]_sig_2[k,10] Sigma[1,2]=Sigma[2,1] Sigma[2,2]=sig_2[k,10]^2 for (w in 1:10){ Mu=c(log(premium1[w])+logelr1[k]+alpha1[k,w], log(premium2[w])+logelr2[k]+alpha2[k,w]) lloss=rmvnorm(1,Mu,Sigma) atu1[w]=exp(lloss[1]) atu2[w]=exp(lloss[2]) } at=c(atu1,atu2) } stopCluster(cl) # atu1.wd10=at.wd10[,1:10] atu2.wd10=at.wd10[,11:20] #

calculate loss statistics and output to data frame

# Premium1=subset(rdata1,rdata1$d==1)$premium ssu1.wd10=rep(0,10) msu1.wd10=rep(0,10) Premium2=subset(rdata2,rdata2$d==1)$premium ssu2.wd10=rep(0,10) msu2.wd10=rep(0,10) PremiumC=Premium1+Premium2 ssC.wd10=rep(0,10) msC.wd10=rep(0,10) # msu1.wd10[1]=mean(atu1.wd10[,1]) for (w in 2:10){ msu1.wd10[w]=mean(atu1.wd10[,w]) ssu1.wd10[w]=sd(atu1.wd10[,w]) } Pred.CSR1=rowSums(atu1.wd10) msu1.td10=mean(Pred.CSR1) ssu1.td10=sd(Pred.CSR1) CSR1.Estimate=round(msu1.wd10) CSR1.SE=round(ssu1.wd10) CSR1.CV=round(CSR1.SE/CSR1.Estimate,4) act1=sum(subset(aloss1,adata1$d==10)[1:10]) pct.CSR1=sum(Pred.CSR1<=act1)/length(Pred.CSR1)*100 #

put CSR accident year statistics into a data frame

# CSR1.Estimate=c(CSR1.Estimate,round(msu1.td10)) CSR1.SE=c(CSR1.SE,round(ssu1.td10)) CSR1.CV=c(CSR1.CV,round(ssu1.td10/msu1.td10,4)) Premium1=c(Premium1,sum(Premium1)) Outcome1=subset(aloss1,adata1$d==10) Outcome1=c(Outcome1,sum(Outcome1)) Group=rep(grpcode,11) CSR1.Pct=c(rep(NA,10),pct.CSR1) # msu2.wd10[1]=mean(atu2.wd10[,1]) for (w in 2:10){ msu2.wd10[w]=mean(atu2.wd10[,w]) ssu2.wd10[w]=sd(atu2.wd10[,w]) } Pred.CSR2=rowSums(atu2.wd10) msu2.td10=mean(Pred.CSR2) ssu2.td10=sd(Pred.CSR2) CSR2.Estimate=round(msu2.wd10) CSR2.SE=round(ssu2.wd10) CSR2.CV=round(CSR2.SE/CSR2.Estimate,4) act2=sum(subset(aloss2,adata2$d==10)[1:10]) pct.CSR2=sum(Pred.CSR2<=act2)/length(Pred.CSR2)*100 #

put CSR accident year statistics into a data frame

# W=c(1:10,"Total") CSR2.Estimate=c(CSR2.Estimate,round(msu2.td10)) CSR2.SE=c(CSR2.SE,round(ssu2.td10)) CSR2.CV=c(CSR2.CV,round(ssu2.td10/msu2.td10,4)) Premium2=c(Premium2,sum(Premium2)) Outcome2=subset(aloss2,adata2$d==10) Outcome2=c(Outcome2,sum(Outcome2)) Group=rep(grpcode,11) CSR2.Pct=c(rep(NA,10),pct.CSR2) # atC.wd10=atu1.wd10+atu2.wd10 msC.wd10[1]=mean(atC.wd10[,1]) for (w in 2:10){ msC.wd10[w]=mean(atC.wd10[,w]) ssC.wd10[w]=sd(atC.wd10[,w]) } Pred.CSRC=rowSums(atC.wd10) msC.td10=mean(Pred.CSRC) ssC.td10=sd(Pred.CSRC) CSRC.Estimate=round(msC.wd10) CSRC.SE=round(ssC.wd10) actC=act1+act2 pct.CSRC=sum(Pred.CSRC<=actC)/length(Pred.CSRC)*100 W=c(1:10,"Total") CSRC.Estimate=c(CSRC.Estimate,round(msC.td10)) CSRC.SE=c(CSRC.SE,round(ssC.td10)) CSRC.CV=round(CSRC.SE/CSRC.Estimate,4) PremiumC=c(PremiumC,sum(PremiumC)) OutcomeC=Outcome1+Outcome2 Group=rep(grpcode,11) CSRC.Pct=c(rep(NA,10),pct.CSRC) risk=data.frame(W, Premium1,CSR1.Estimate,CSR1.SE,CSR1.CV,Outcome1,CSR1.Pct, Premium2,CSR2.Estimate,CSR2.SE,CSR2.CV,Outcome2,CSR2.Pct, PremiumC,CSRC.Estimate,CSRC.SE,CSRC.CV,OutcomeC,CSRC.Pct) print(risk) write.csv(risk,file=outfilename) #

plot items of interest

# par(mfrow=c(1,1)) hist(rho.mcmc,main=expression(paste("Posterior Distribution of ",rho)), xlab=expression(rho), sub=paste("Posterior Mean = ",round(mean(rho.mcmc),3))) # xrange=range(Pred.CSR1,Pred.CSR2,Pred.CSRC) par(mfrow=c(3,1)) hist(Pred.CSR1,main="Posterior Distribution of Outcomes", xlim=xrange,breaks=100,xlab="Univariate Line 1") hist(Pred.CSR2,main="", xlim=xrange,breaks=100,xlab="Univariate Line 2") hist(Pred.CSRC,main="", xlim=xrange,breaks=100,xlab="Bivariate Line 1 + Line 2") # t1=Sys.time() print(t1-t0) #

get independent sum

# par(mfrow=c(1,1)) Pred.CSRI=outcomes1+outcomes2 plot(sort(Pred.CSRC),sort(Pred.CSRI)) abline(0,1) par(mfrow=c(2,1)) hist(Pred.CSRC,sub=paste("Mean =",round(mean(Pred.CSRC)), "S.D. =",round(sd(Pred.CSRC)))) hist(Pred.CSRI,sub=paste("Mean =",round(mean(Pred.CSRI)), "S.D. =",round(sd(Pred.CSRI))))

Glenn

On Oct 31, 2015, at 10:02 PM, maverickg notifications@github.com wrote:

@GGMeyers https://github.com/GGMeyers please share the code to replicate the error you described.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/rstan/issues/225#issuecomment-152786764.

maverickg commented 9 years ago

@GGMeyers thanks for the code. But I run it without error using the current dev branch. If you still have this issue, can you let us know how do you use R (say, use RStudio or not) and which platform you are using (mac, windows, or linux)? Also please provide the message printed right after rstan is loaded.

GGMeyers commented 9 years ago

Hmm,

I use RStudio. I ran the same code that I sent you using Stan 2.8.1, being careful to run it immediately after opening RStudio. It did run normally, in fact a bit faster than it ran with 2.8.0 with the workaround, which was noticeably faster than 2.6.

When I ran it on 2.8.1 before I let it run for over 20 minutes (it now runs in about 4 minutes) before I cut it off. I have no recollection of what else might have happened in the system before I initiated the run.

Anyway thanks for your prompt attention to the matter and I apologize for the false alarm.

Glenn

On Nov 1, 2015, at 4:24 PM, maverickg notifications@github.com wrote:

@GGMeyers https://github.com/GGMeyers thanks for the code. But I run it without error understand the current dev branch. If you still have this issue, can you let us know how do you use R (say, use RStudio or not) and which platform you are using (mac, windows, or linux). Also please provide the message printed right after rstan is loaded.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/rstan/issues/225#issuecomment-152864973.

maverickg commented 9 years ago

Close this as it should be fixed.

stan-dev / rstan

resource exhaustion in repeated calls to stan() #225

extract information from jags output to process in R

Rhat is commmented out after experimenting as it doubles the run time

Rhat=as.matrix(summary(fit1)$summary)[1,10]

rho.mcmc.out=c(rho,Rhat)

S3 method for class 'connection'

Script to run the bivariate CSR model

by Glenn Meyers

user inputs

source of insurer data

http://www.casact.org/research/index.cfm?fa=loss_reserves_data

function to get Schedule P triangle data given ins group and line of business

get incremental paid losses - assume data is sorted by ay and lag

read and aggregate the insurer data and

set up training and test data frames

for line 1

sort the data in order of d, then w within d

read and aggregate the insurer data and

set up training and test data frames

for line 2

sort the data in order of d, then w within d

univariate models

Stan script for univariate models

initialization function for scodeU

univariate line 1

data for univariate 1

compile the univariate model

fitU.debug = stan(fit=fitU,data=data.u1,init=initU,chains=4,

seed=12345,iter=100,verbose=F)

run the univariate model for line 1

print(fitU1_summary)

optional diagnostics

plot(fitU1)

traceplot(fitU1)

extract information from stan output to process in R

obtain outcomes

univariate line 2

data for univariate line 2

run the univariate model for line 2

print(fitU2_summary)

optional diagnostics

plot(fitU1)

traceplot(fitU1)

extract information from stan output to process in R

simulate loss statistics by accident year

obtain outcomes

Stan Code

compile the bivariate model

fit_debug = stan(fit=fit0,data=data.for.stan,chains=1,iter=10,verbose=F)

set up parallel processing

workaround for connection bugs in stan 2.8.0

ncon=dim(showConnections())[1]

for (i in 7:ncon){

if(ncon>6){close.connection(getConnection(i))}

}

extract information from MCMC output to process in R

Rhat is commmented out after experimenting as it doubles the run time

Rhat=as.matrix(summary(fit1)$summary)[1,10]

rho.mcmc.out=c(rho,Rhat)

get the predictive distribution for line1, line2 and line1+line2

calculate loss statistics and output to data frame

put CSR accident year statistics into a data frame

put CSR accident year statistics into a data frame

plot items of interest

get independent sum