stan-dev / rstan

RStan, the R interface to Stan
https://mc-stan.org
1.04k stars 265 forks source link

####R Studio crashes when I rerun my stan model #392

Open GGMeyers opened 7 years ago

GGMeyers commented 7 years ago

Summary:

Please provide a short summary (no more than a sentence or two).

Description:

Describe the issue as clearly as possible.

Reproducible Steps:

Please report steps to reproduce the issue. If it's not possible to reproduce, please include a description of how you discovered the issue.

If you have a reproducible example, please include it.

Current Output:

If applicable, any relevant output from RStan.

Expected Output:

If applicable, the output you expected from RStan.

RStan Version:

The version of RStan you are running (e.g., from packageVersion("rstan"))

R Version:

The version of R you are running (e.g., from R.version.string)

Operating System:

Your operating system (e.g., OS X 10.11.3)

GGMeyers commented 7 years ago

More details below

The script produces output that appears to be right.

When I rerun the script in RStudio, R (or RStudio) crashes.

When I run my model I get the following warning.

In file included from file2b94c34477d.cpp:8: In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/src/stan/model/model_header.hpp:4: In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math.hpp:4: In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/rev/mat.hpp:4: In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/rev/core.hpp:12: In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5: In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/rev/core/var.hpp:7: In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/BH/include/boost/math/tools/config.hpp:13: In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/BH/include/boost/config.hpp:39: /Library/Frameworks/R.framework/Versions/3.3/Resources/library/BH/include/boost/config/compiler/clang.hpp:196:11: warning: 'BOOST_NO_CXX11_RVALUE_REFERENCES' macro redefined [-Wmacro-redefined]

define BOOST_NO_CXX11_RVALUE_REFERENCES

      ^

:5:9: note: previous definition is here

define BOOST_NO_CXX11_RVALUE_REFERENCES 1

Here is the sessionInfo() output R version 3.3.2 (2016-10-31) Platform: x86_64-apple-darwin13.4.0 (64-bit) Running under: macOS Sierra 10.12.3

locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages: [1] parallel stats graphics grDevices utils datasets methods base

other attached packages: [1] doParallel_1.0.10 iterators_1.0.8 foreach_1.4.3
[4] loo_1.0.0 rstan_2.14.1 StanHeaders_2.14.0-1 [7] ggplot2_2.2.1

loaded via a namespace (and not attached): [1] Rcpp_0.12.9 codetools_0.2-15 matrixStats_0.51.0 assertthat_0.1
[5] grid_3.3.2 plyr_1.8.4 gtable_0.2.0 stats4_3.3.2
[9] scales_0.4.1 lazyeval_0.2.0 tools_3.3.2 munsell_0.4.3
[13] compiler_3.3.2 inline_0.3.14 colorspace_1.3-2 gridExtra_2.2.1
[17] tibble_1.2

Here is the code

rm(list = ls()) # clear workspace") t0=Sys.time() library(rstan) rstan_options(auto_write = TRUE) options(mc.cores = parallel::detectCores()) library(loo) library(parallel) library(doParallel) # setwd("~/Dropbox/Tyrell") a=read.csv("Dummy_FullData.csv") iseg=1 iID=1 #

function to get Schedule P triangle data given ins group and line of business

# ins.line.data=function(seg,id){ b=subset(a,a$ICS_Segment==segment[seg]) b=subset(b,b$ID==insurer_id[id]) Segment=rep(b$ICS_Segment[seg],55) ID=rep(b$ID[id],55) premium=NULL w=NULL d=NULL cpdloss=NULL for(dy in 1:10){ d=c(d,rep(dy,11-dy)) w=c(w,1:(11-dy)) premium=c(premium,b$Earned_Premium[1:(11-dy)]) cpdloss=c(cpdloss,b[1:(11-dy),7+dy]) } data.out=data.frame(ID,Segment,premium,w,d,cpdloss) return(data.out) } # segment=unique(a$ICS_Segment) print(segment) insurer_id=unique(a$ID) print(insurer_id) rdata=ins.line.data(iseg,iID) print(rdata) #

Stan script for the model

# scodeU = " data{ real logprem[10]; real logloss[55]; int w[55]; int d[55]; } parameters{ real tau; real r_alpha[9]; real r_beta[9]; real logelr; real a_ig[10]; real gamma; real delta; } transformed parameters{ real alpha[10]; real beta[10]; real speedup[10]; real sig2[10]; real sig[10]; real mu[55]; alpha[1] = 0; alpha[2] = r_alpha[1]; for (i in 3:10) alpha[i] = alpha[i-1]+r_alpha[i-1]; for (i in 1:9) beta[i] = 10r_beta[i]-5; beta[10] = 0; speedup[1] = 1; for (i in 2:10) speedup[i] = speedup[i-1](1-gamma-(i-2)delta); sig2[10] = gamma_cdf(1/a_ig[10],1,1); for (i in 1:9) sig2[10-i] = sig2[11-i]+gamma_cdf(1/a_ig[i],1,1); for (i in 1:10) sig[i] = sqrt(sig2[i]); for (i in 1:55){ mu[i] = logprem[w[i]]+logelr+alpha[w[i]]+beta[d[i]]speedup[w[i]]; } } model { tau ~ uniform(0,1); for (i in 1:9) r_alpha[i] ~ normal(0,tau); for (i in 1:9) r_beta[i] ~ uniform(0,1); for (i in 1:10) a_ig[i] ~ inv_gamma(1,1); logelr ~ uniform(-1.5,0.5); gamma ~ normal(0,0.05); delta ~ normal(0,0.01); for (i in 1:55) logloss[i] ~ normal(mu[i],sig[d[i]]); } generated quantities{ vector[55] log_lik; for (i in 1:55) log_lik[i] = normal_lpdf(logloss[i]|mu[i],sig[d[i]]); } "

#

initialization function for scodeU

# initU=function(chain_id){ set.seed(12345+chain_id) list(r_alpha=rnorm(9,0,0.2),r_beta=runif(9),a=runif(10), logelr=runif(1,-0.75,-0.5),gamma=rnorm(1,0,0.1), delta=rnorm(1,0,0.02)) } #

data for univariate 1

# data.u=list(logprem = log(rdata$premium[1:10]), logloss = log(rdata$cpdloss), w = rdata$w, d = rdata$d) #

compile the univariate model

# pars.list=c("alpha","tau","beta","gamma","delta","logelr","sig","log_lik","a_ig") fitU = stan(model_code=scodeU,data=data.u,seed=12345,init=initU,chains=0)

fitU.debug = stan(fit=fitU,data=data.u,init=initU,chains=1,

seed=12345,iter=100,verbose=F)

#

run the univariate model for line 1

# stan_thin=1 stan_iter=5000 Rhat_target=1.05 max_Rhat=2 while ((max_Rhat > Rhat_target)&(stan_thin<65)){ fitU1=stan(fit = fitU, data = data.u,init=initU, seed = 12345,iter=stan_iter,thin=stan_thin, chains = 4,pars=pars.list, control=list(adapt_delta=.99)) fitU1_summary=as.matrix(summary(fitU1)$summary)[,c(1,3,10)] mrh=subset(fitU1_summary,is.na(fitU1_summary[,3])==F) max_Rhat=round(max(mrh[,3]),4) mean_lp=round(fitU1_summary[dim(fitU1_summary)[1],1],2) print(paste("Maximum Rhat =",max_Rhat," Mean lp__ =",mean_lp, "Thin =",stan_thin)) stan_thin=2stan_thin stan_iter=2stan_iter }

print(fitU1_summary)

#

optional diagnostics

#

plot(fitU1)

traceplot(fitU1)

#

extract information from stan output to process in R

# b=extract(fitU1) alpha=b$alpha tau=b$tau beta=b$beta gamma=b$gamma delta=b$delta logelr=b$logelr sig=b$sig #

simulate outcomes for d=10 using parallel processing

# set.seed(12345) cl <- makePSOCKcluster(4) registerDoParallel(cl) at.wd10=foreach (k=1:length(gamma),.combine=rbind) %dopar%{ atv=rep(0,11) for (w in 1:10){ atv[w]=exp(log(rdata$premium[w])+logelr[k]+alpha[k,w]+sig[k,10]^2/2) } w=11 alpha_new=rnorm(1,alpha[k,10],tau[k]) atv[11]=exp(log(rdata$premium[10])+logelr[k]+alpha_new+sig[k,10]^2/2) at=atv } stopCluster(cl)

#

calculate loss statistics and output to data frame

# Premium=subset(rdata,rdata$d==1)$premium ss.wd10=rep(0,11) ms.wd10=rep(0,11) # ms.wd10[1]=mean(at.wd10[,1]) for (w in 2:11){ ms.wd10[w]=mean(at.wd10[,w]) ss.wd10[w]=sd(at.wd10[,w]) } Pred.CSR=rowSums(at.wd10) ms.td10=mean(Pred.CSR) ss.td10=sd(Pred.CSR) CSR.Estimate=round(ms.wd10) CSR.SE=round(ss.wd10) CSR.CV=round(CSR.SE/CSR.Estimate,4) #

put CSR accident year statistics into a data frame

# CSR.Estimate=c(CSR.Estimate,round(ms.td10)) CSR.SE=c(CSR.SE,round(ss.td10)) CSR.CV=c(CSR.CV,round(ss.td10/ms.td10,4)) Premium=c(Premium,Premium[10],sum(Premium)) Segment=rep(segment[iseg],12) ID=rep(insurer_id[iID],12) W=c(1:11,"Total") risk=data.frame(ID,Segment,W,Premium,CSR.Estimate,CSR.SE,CSR.CV) print(risk)

write.csv(risk,file=outfilename,row.names=F)

#

plot items of interest

# par(mfrow=c(1,1)) hist(Pred.CSR,main="Posterior Distribution of Outcomes",xlab="Outcome", breaks=50,sub=paste("Mean=",CSR.Estimate[11]," Std. Dev.=",CSR.SE[11])) #

graphical diagnostics

# resid=NULL ay=NULL dy=NULL cy=NULL w=rdata$w d=rdata$d nsamp=100 samp=sample(1:length(gamma),100) for (s in samp){ speedup=rep(1,10) for (i in 2:10){ speedup[i]=speedup[i-1](1-gamma[s]-(i-2)delta[s]) } for (i in 1:dim(rdata)[1]){ mu=log(rdata$premium[w[i]])+logelr[s]+alpha[s,w[i]]+ beta[s,d[i]]*speedup[w[i]] resid=c(resid,(log(rdata$cpdloss[i])-mu)/sig[s,d[i]]) ay=c(ay,w[i]) dy=c(dy,d[i]) cy=c(cy,w[i]+d[i]-1) } } plotmain=paste(segment[iseg],insurer_id[iID]) par(mfrow=c(1,3)) plot(ay,resid,main=plotmain,xlab="Accident Year", ylab="Standardized Residuals",ylim=c(-3.5,3.5)) abline(0,0) plot(dy,resid,main=plotmain,xlab="Development Year", ylab="Standardized Residuals",ylim=c(-3.5,3.5)) abline(0,0) plot(cy,resid,main=plotmain,xlab="Calendar Year", ylab="Standardized Residuals",ylim=c(-3.5,3.5)) abline(0,0) # loglik1=extract_log_lik(fitU1) loo1 <- loo(loglik1) print(loo1, digits = 3) waic1<-waic(loglik1) print(waic1,digits=3) # t1=Sys.time() print(t1-t0)

GGMeyers commented 7 years ago

I must have inadvertently closed the issue

aadler commented 7 years ago

Hi, Dr. Meyers. This looks like it is related to these issues: https://github.com/stan-dev/rstan/issues/372 and https://github.com/stan-dev/rstan/issues/389. For the time being, you may have to adjust your makevars until the Boost bug is fixed and then rstan can pick up the correction downstream.

martinmodrak commented 7 years ago

I would like to point out, that I have had the exact same issue and was able to replicate it with an empty model (just empty data, parameters and model blocks), so this is not model-dependent. Adding -Wno-macro-redefined to the Makevars file did indeed solve the problem.

The problem also did not manifest when I compiled without optimizations (-O1 -g)

cimentadaj commented 6 years ago

Happened to me as well but fixed it immediately with #372