stan-dev / rstanarm

rstanarm R package for Bayesian applied regression modeling
https://mc-stan.org/rstanarm
GNU General Public License v3.0

Very small bug in vignette code #246

Open jgellar opened 6 years ago

jgellar commented 6 years ago

Summary:

In the "How to use the rstanarm package" vignette, there is a line that calls shinystan. This code throws an error.

Description:

The error is as follows:

Error in validate_y(y) : NAs not allowed in 'y'.

This occurs because one row in the dataset has both agree and disagree equal to 0. The model fits fine, but the shinystan call throws the error. If you remove this row from the dataset before fitting the model, there is no error. You likely want to fix this within shinystan, but since I was running an rstanarm vignette, I'm posting it here.
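To illustrate why a zero-trial row can trip an NA check, here is a minimal base R sketch. It assumes, for illustration only, that the observed proportion is computed as agree / (agree + disagree); the data frame `toy` is hypothetical and merely stands in for womensrole.

```r
# Hypothetical stand-in for womensrole: the third row has zero trials.
toy <- data.frame(agree = c(4, 2, 0), disagree = c(2, 0, 0))
toy$total <- toy$agree + toy$disagree

# 0 / 0 is NaN in R, and is.na() treats NaN as missing.
prop <- toy$agree / toy$total
is.na(prop)

# Locate the zero-trial rows and drop them before fitting.
zero_rows <- which(toy$total == 0)
toy_clean <- toy[toy$total > 0, ]
```

Applying the same filter to womensrole (on the total column created in the steps below) before calling stan_glm should correspond to the "remove this row" fix described above.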

Reproducible Steps:

data("womensrole", package = "HSAUR3")
womensrole$total <- womensrole$agree + womensrole$disagree
library(rstanarm)
CORES <- 4
SEED <- 12345
CHAINS <- 4
womensrole_bglm_1 <- stan_glm(cbind(agree, disagree) ~ education + gender,
                              data = womensrole,
                              family = binomial(link = "logit"), 
                              prior = student_t(df = 7), 
                              prior_intercept = student_t(df = 7),
                              chains = CHAINS, cores = CORES, seed = SEED)

library(shinystan)
launch_shinystan(womensrole_bglm_1)

RStanARM Version:

2.15.3 (shinystan version 2.3.0)

R Version:

3.4.0

Operating System:

Windows 10

jgabry commented 6 years ago

@jgellar Thanks for reporting and sorry for the slow response. It's been a busy month for us with StanCon. I think this is actually an rstanarm issue, not a shinystan issue, so this is definitely the right place to report it. The error occurs when shinystan internally calls rstanarm's pp_check function to do graphical posterior predictive checks. You'll get the same error if you call pp_check directly instead of launch_shinystan, i.e.,

pp_check(womensrole_bglm_1)

So I think it's something that I can fix in pp_check and then it should be fine with shinystan. Thanks again for letting us know.

kejiashi commented 6 years ago

@jgabry Hi Jonah,

I tried both the launch_shinystan function and the pp_check function and got the same error. The model I fit (preference_2) is an MRP model with many demographic features. The error message and test results are below. Did I miss something?

Thank you very much!

Code and results:

> table(preference_2$y)

   0    1    2    3    4    5    6    7    8    9 
3183 2146  488  170   68   26   12   10    1    2 
> table(is.na(preference_2$y))

FALSE 
 6106 
> launch_shinystan(preference_2)

Hang on... preparing graphical posterior predictive checks for rstanarm model.
See help('shinystan', 'rstanarm') for how to disable this feature.
Error in validate_y(y) : NAs not allowed in 'y'.
> pp_check(preference_2)
Error in validate_y(y) : NAs not allowed in 'y'.

jgabry commented 6 years ago

I still need to fix this, but in the meantime a workaround for creating the pp_check plots is to call the underlying bayesplot functions directly and replace any NAs. For example:

y <- womensrole$agree
yrep <- posterior_predict(womensrole_bglm_1)

trials <- womensrole$agree + womensrole$disagree
y_prop <- y / trials  # proportions
y_prop[24] <- 0
yrep_prop <- sweep(yrep, 2, trials, "/")
yrep_prop[, 24] <- 0

library(bayesplot)
ppc_dens_overlay(y_prop, yrep_prop[1:25, ])
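The index 24 above is specific to womensrole. A sketch of the same replacement that finds the zero-trial observations automatically is shown here; `y`, `trials`, and `yrep` are small made-up values rather than real posterior_predict output, so the snippet runs with base R alone.

```r
# Made-up data: 3 observations, the second with zero trials.
y      <- c(3, 0, 5)
trials <- c(10, 0, 5)

# Made-up 2 x 3 matrix standing in for posterior_predict() draws
# (rows are draws, columns are observations).
yrep <- matrix(c(2, 0, 4,
                 4, 0, 5), nrow = 2, byrow = TRUE)

zero <- trials == 0                       # observations with no trials

y_prop <- y / trials                      # NaN where trials == 0
y_prop[zero] <- 0                         # replace, as in the workaround

yrep_prop <- sweep(yrep, 2, trials, "/")  # divide each column by its trials
yrep_prop[, zero] <- 0
```

With a real model, y_prop and yrep_prop built this way could be passed to ppc_dens_overlay in the same manner as the hard-coded version.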

jgabry commented 6 years ago

Actually, it's also possible to get shinystan to launch if you tell it not to draw from the posterior predictive distribution:

launch_shinystan(womensrole_bglm_1, ppd=FALSE)

robertlynch66 commented 5 years ago

Has this been resolved yet? I am having the same problem with the pp_check function, where I get the same error: "Error in validate_y(y) : NAs not allowed in 'y'." I can't seem to get the workaround to replace the NAs in my models.