easystats / report

:scroll: :tada: Automated reporting of objects in R
https://easystats.github.io/report/

report(stan_glm) produces output twice #286

Closed fschaffner closed 1 year ago

fschaffner commented 2 years ago

Hi, thanks for creating and maintaining the package!

Reporting stan_glm() models currently produces the same output twice. See this example from your README:

library(report)
library(rstanarm)
#> Loading required package: Rcpp
#> This is rstanarm version 2.21.3
#> - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
#> - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
#> - For execution on a local, multicore CPU with excess RAM we recommend calling
#>   options(mc.cores = parallel::detectCores())

model <- stan_glm(mpg ~ qsec + wt, data = mtcars, refresh = 0)
report(model)
#> We fitted a Bayesian linear model (estimated using MCMC sampling with 4 chains
#> of 2000 iterations and a warmup of 1000) to predict mpg with qsec and wt
#> (formula: mpg ~ qsec + wt). Priors over parameters were set as normal (mean =
#> 0.00, SD = 8.43) distributions. The model's explanatory power is substantial
#> (R2 = 0.81, 95% CI [0.70, 0.89], adj. R2 = 0.79). The model's intercept,
#> corresponding to qsec = 0 and wt = 0, is at 19.74 (95% CI [9.11, 30.04]).
#> Within this model:
#> 
#>   - The effect of qsec (Median = 0.93, 95% CI [0.41, 1.45]) has a 99.98%
#> probability of being positive (> 0), 99.10% of being significant (> 0.30), and
#> 0.22% of being large (> 1.81). The estimation successfully converged (Rhat =
#> 1.000) and the indices are reliable (ESS = 3644)
#>   - The effect of wt (Median = -5.05, 95% CI [-6.06, -4.01]) has a 100.00%
#> probability of being negative (< 0), 100.00% of being significant (< -0.30),
#> and 100.00% of being large (< -1.81). The estimation successfully converged
#> (Rhat = 0.999) and the indices are reliable (ESS = 4002)
#> 
#> Following the Sequential Effect eXistence and sIgnificance Testing (SEXIT)
#> framework, we report the median of the posterior distribution and its 95% CI
#> (Highest Density Interval), along the probability of direction (pd), the
#> probability of significance and the probability of being large. The thresholds
#> beyond which the effect is considered as significant (i.e., non-negligible) and
#> large are |0.30| and |1.81| (corresponding respectively to 0.05 and 0.30 of the
#> outcome's SD). Convergence and stability of the Bayesian sampling has been
#> assessed using R-hat, which should be below 1.01 (Vehtari et al., 2019), and
#> Effective Sample Size (ESS), which should be greater than 1000 (Burkner, 2017).
#> and We fitted a Bayesian linear model (estimated using MCMC sampling with 4
#> chains of 2000 iterations and a warmup of 1000) to predict mpg with qsec and wt
#> (formula: mpg ~ qsec + wt). Priors over parameters were set as normal (mean =
#> 0.00, SD = 15.40) distributions. The model's explanatory power is substantial
#> (R2 = 0.81, 95% CI [0.70, 0.89], adj. R2 = 0.79). The model's intercept,
#> corresponding to qsec = 0 and wt = 0, is at 19.74 (95% CI [9.11, 30.04]).
#> Within this model:
#> 
#>   - The effect of qsec (Median = 0.93, 95% CI [0.41, 1.45]) has a 99.98%
#> probability of being positive (> 0), 99.10% of being significant (> 0.30), and
#> 0.22% of being large (> 1.81). The estimation successfully converged (Rhat =
#> 1.000) and the indices are reliable (ESS = 3644)
#>   - The effect of wt (Median = -5.05, 95% CI [-6.06, -4.01]) has a 100.00%
#> probability of being negative (< 0), 100.00% of being significant (< -0.30),
#> and 100.00% of being large (< -1.81). The estimation successfully converged
#> (Rhat = 0.999) and the indices are reliable (ESS = 4002)
#> 
#> Following the Sequential Effect eXistence and sIgnificance Testing (SEXIT)
#> framework, we report the median of the posterior distribution and its 95% CI
#> (Highest Density Interval), along the probability of direction (pd), the
#> probability of significance and the probability of being large. The thresholds
#> beyond which the effect is considered as significant (i.e., non-negligible) and
#> large are |0.30| and |1.81| (corresponding respectively to 0.05 and 0.30 of the
#> outcome's SD). Convergence and stability of the Bayesian sampling has been
#> assessed using R-hat, which should be below 1.01 (Vehtari et al., 2019), and
#> Effective Sample Size (ESS), which should be greater than 1000 (Burkner, 2017).

Created on 2022-09-09 with reprex v2.0.2

Session info ``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.2.1 (2022-06-23) #> os macOS Big Sur ... 10.16 #> system x86_64, darwin17.0 #> ui X11 #> language (EN) #> collate en_US.UTF-8 #> ctype en_US.UTF-8 #> tz Europe/Zurich #> date 2022-09-09 #> pandoc 2.18 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.2.0) #> base64enc 0.1-3 2015-07-28 [1] CRAN (R 4.2.0) #> bayesplot 1.9.0 2022-03-10 [1] CRAN (R 4.2.0) #> bayestestR 0.12.1 2022-05-02 [1] CRAN (R 4.2.0) #> boot 1.3-28 2021-05-03 [1] CRAN (R 4.2.1) #> callr 3.7.2 2022-08-22 [1] CRAN (R 4.2.0) #> cli 3.4.0 2022-09-08 [1] CRAN (R 4.2.1) #> coda 0.19-4 2020-09-30 [1] CRAN (R 4.2.0) #> codetools 0.2-18 2020-11-04 [1] CRAN (R 4.2.1) #> colorspace 2.0-3 2022-02-21 [1] CRAN (R 4.2.0) #> colourpicker 1.1.1 2021-10-04 [1] CRAN (R 4.2.0) #> crayon 1.5.1 2022-03-26 [1] CRAN (R 4.2.0) #> crosstalk 1.2.0 2021-11-04 [1] CRAN (R 4.2.0) #> datawizard 0.5.1 2022-08-17 [1] CRAN (R 4.2.1) #> DBI 1.1.3 2022-06-18 [1] CRAN (R 4.2.0) #> digest 0.6.29 2021-12-01 [1] CRAN (R 4.2.0) #> dplyr 1.0.10 2022-09-01 [1] CRAN (R 4.2.0) #> DT 0.24 2022-08-09 [1] CRAN (R 4.2.0) #> dygraphs 1.1.1.6 2018-07-11 [1] CRAN (R 4.2.0) #> effectsize 0.7.0.5 2022-08-10 [1] CRAN (R 4.2.0) #> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0) #> emmeans 1.8.1-1 2022-09-08 [1] CRAN (R 4.2.1) #> estimability 1.4.1 2022-08-05 [1] CRAN (R 4.2.0) #> evaluate 0.16 2022-08-09 [1] CRAN (R 4.2.0) #> fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.0) #> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.0) #> fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.0) #> generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.0) #> ggplot2 3.3.6 2022-05-03 [1] CRAN (R 4.2.0) #> ggridges 0.5.3 2021-01-08 [1] CRAN (R 
4.2.0) #> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0) #> gridExtra 2.3 2017-09-09 [1] CRAN (R 4.2.0) #> gtable 0.3.1 2022-09-01 [1] CRAN (R 4.2.0) #> gtools 3.9.3 2022-07-11 [1] CRAN (R 4.2.0) #> highr 0.9 2021-04-16 [1] CRAN (R 4.2.0) #> htmltools 0.5.3 2022-07-18 [1] CRAN (R 4.2.0) #> htmlwidgets 1.5.4 2021-09-08 [1] CRAN (R 4.2.0) #> httpuv 1.6.6 2022-09-08 [1] CRAN (R 4.2.1) #> igraph 1.3.4 2022-07-19 [1] CRAN (R 4.2.0) #> inline 0.3.19 2021-05-31 [1] CRAN (R 4.2.0) #> insight 0.18.2.5 2022-08-15 [1] https://easystats.r-universe.dev (R 4.2.1) #> knitr 1.40 2022-08-24 [1] CRAN (R 4.2.0) #> later 1.3.0 2021-08-18 [1] CRAN (R 4.2.0) #> lattice 0.20-45 2021-09-22 [1] CRAN (R 4.2.1) #> lifecycle 1.0.2 2022-09-09 [1] CRAN (R 4.2.1) #> lme4 1.1-30 2022-07-08 [1] CRAN (R 4.2.0) #> loo 2.5.1 2022-03-24 [1] CRAN (R 4.2.0) #> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0) #> markdown 1.1 2019-08-07 [1] CRAN (R 4.2.0) #> MASS 7.3-58.1 2022-08-03 [1] CRAN (R 4.2.0) #> Matrix 1.4-1 2022-03-23 [1] CRAN (R 4.2.1) #> matrixStats 0.62.0 2022-04-19 [1] CRAN (R 4.2.0) #> mime 0.12 2021-09-28 [1] CRAN (R 4.2.0) #> miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.2.1) #> minqa 1.2.4 2014-10-09 [1] CRAN (R 4.2.0) #> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.0) #> mvtnorm 1.1-3 2021-10-08 [1] CRAN (R 4.2.0) #> nlme 3.1-159 2022-08-09 [1] CRAN (R 4.2.0) #> nloptr 2.0.3 2022-05-26 [1] CRAN (R 4.2.0) #> parameters 0.18.2 2022-08-10 [1] CRAN (R 4.2.0) #> performance 0.9.2 2022-08-10 [1] CRAN (R 4.2.0) #> pillar 1.8.1 2022-08-19 [1] CRAN (R 4.2.1) #> pkgbuild 1.3.1 2021-12-20 [1] CRAN (R 4.2.0) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0) #> plyr 1.8.7 2022-03-24 [1] CRAN (R 4.2.0) #> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.2.0) #> processx 3.7.0 2022-07-07 [1] CRAN (R 4.2.0) #> promises 1.2.0.1 2021-02-11 [1] CRAN (R 4.2.0) #> ps 1.7.1 2022-06-18 [1] CRAN (R 4.2.0) #> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.2.0) #> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.2.0) #> R.methodsS3 1.8.2 2022-06-13 [1] CRAN 
(R 4.2.0) #> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.2.0) #> R.utils 2.12.0 2022-06-28 [1] CRAN (R 4.2.0) #> R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0) #> Rcpp * 1.0.9 2022-07-08 [1] CRAN (R 4.2.0) #> RcppParallel 5.1.5 2022-01-05 [1] CRAN (R 4.2.0) #> report * 0.5.5.1 2022-09-09 [1] Github (easystats/report@ae6ad14) #> reprex 2.0.2 2022-08-17 [1] CRAN (R 4.2.0) #> reshape2 1.4.4 2020-04-09 [1] CRAN (R 4.2.0) #> rlang 1.0.5 2022-08-31 [1] CRAN (R 4.2.0) #> rmarkdown 2.16 2022-08-24 [1] CRAN (R 4.2.0) #> rstan 2.21.7 2022-09-08 [1] CRAN (R 4.2.1) #> rstanarm * 2.21.3 2022-04-09 [1] CRAN (R 4.2.0) #> rstantools 2.2.0 2022-04-08 [1] CRAN (R 4.2.0) #> rstudioapi 0.14 2022-08-22 [1] CRAN (R 4.2.0) #> scales 1.2.1 2022-08-20 [1] CRAN (R 4.2.1) #> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0) #> shiny 1.7.2 2022-07-19 [1] CRAN (R 4.2.0) #> shinyjs 2.1.0 2021-12-23 [1] CRAN (R 4.2.0) #> shinystan 2.6.0 2022-03-03 [1] CRAN (R 4.2.0) #> shinythemes 1.2.0 2021-01-25 [1] CRAN (R 4.2.0) #> StanHeaders 2.21.0-7 2020-12-17 [1] CRAN (R 4.2.0) #> stringi 1.7.8 2022-07-11 [1] CRAN (R 4.2.0) #> stringr 1.4.1 2022-08-20 [1] CRAN (R 4.2.0) #> styler 1.7.0.9001 2022-08-07 [1] Github (r-lib/styler@a3b69e4) #> survival 3.4-0 2022-08-09 [1] CRAN (R 4.2.0) #> threejs 0.3.3 2020-01-21 [1] CRAN (R 4.2.0) #> tibble 3.1.8 2022-07-22 [1] CRAN (R 4.2.0) #> tidyselect 1.1.2 2022-02-21 [1] CRAN (R 4.2.0) #> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.0) #> vctrs 0.4.1 2022-04-13 [1] CRAN (R 4.2.0) #> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0) #> xfun 0.32 2022-08-10 [1] CRAN (R 4.2.0) #> xtable 1.8-4 2019-04-21 [1] CRAN (R 4.2.0) #> xts 0.12.1 2020-09-09 [1] CRAN (R 4.2.0) #> yaml 2.3.5 2022-02-21 [1] CRAN (R 4.2.0) #> zoo 1.8-10 2022-04-15 [1] CRAN (R 4.2.0) #> #> [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library #> #> ────────────────────────────────────────────────────────────────────────────── ```
IndrajeetPatil commented 2 years ago

Thanks for the report.

Can you please try with the development version of the package from GitHub and see if the issue is still observed?

I think this has been resolved already by @etiennebacher.

etiennebacher commented 2 years ago

> I think this has been resolved already by @etiennebacher.

Nope, I can reproduce. The problem actually comes from `report_model()`:

library(report)
library(rstanarm)
#> Loading required package: Rcpp
#> This is rstanarm version 2.21.3
#> - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
#> - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
#> - For execution on a local, multicore CPU with excess RAM we recommend calling
#>   options(mc.cores = parallel::detectCores())

model <- stan_glm(mpg ~ qsec + wt, data = mtcars, refresh = 0)
report_model(model)
#> Bayesian linear model (estimated using MCMC sampling with 4 chains of 2000 iterations and a warmup of 1000) to predict mpg with qsec and wt (formula: mpg ~ qsec + wt). Priors over parameters were set as normal (mean = 0.00, SD = 8.43) distributions
#> Bayesian linear model (estimated using MCMC sampling with 4 chains of 2000 iterations and a warmup of 1000) to predict mpg with qsec and wt (formula: mpg ~ qsec + wt). Priors over parameters were set as normal (mean = 0.00, SD = 15.40) distributions

Created on 2022-09-09 with reprex v2.0.2

IndrajeetPatil commented 2 years ago

I see. I thought this was similar to the one fixed in https://github.com/easystats/report/pull/280.

fschaffner commented 2 years ago

Yes, I did try it with the development version and the bug persists. I think @etiennebacher fixed it for glm() models only.

etiennebacher commented 2 years ago

Update: it comes from `report_priors(model)`.

etiennebacher commented 2 years ago

These two lines produce two different values for `Prior_Scale`, which then duplicates the text. I don't know what the text output should be, so I can't fix it, but someone else probably can: https://github.com/easystats/report/blob/ae6ad14897083fd242a145457d70186f58b7870f/R/report.stanreg.R#L62-L63

library(report)
library(rstanarm)
#> Loading required package: Rcpp
#> This is rstanarm version 2.21.3
#> - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
#> - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
#> - For execution on a local, multicore CPU with excess RAM we recommend calling
#>   options(mc.cores = parallel::detectCores())

model <- stan_glm(mpg ~ qsec + wt, data = mtcars, refresh = 0)
params <- bayestestR::describe_prior(model)
params <- params[params$Parameter != "(Intercept)", ]
params
#>   Parameter Prior_Distribution Prior_Location Prior_Scale
#> 2      qsec             normal              0    8.431924
#> 3        wt             normal              0   15.399106

Created on 2022-09-09 with reprex v2.0.2

onwutochs commented 2 years ago

I'm also experiencing the same problem, with report() producing its output three times. Any help will be greatly appreciated.

Thanks, T.

adamhsparks commented 1 year ago

Yes, I was checking to see if there were any issues open about this. I get 10 duplicated summaries for one brms model with only one fixed effect and two random effects. This is unexpected, as report() previously worked properly and gave a single summary.

Edit: this behaviour is the same with the version from the GitHub main branch as well.

aroberts394 commented 1 year ago

I am experiencing the same issue with a simple lm() model. The report() and report_model() functions produce output 3 times.

rempsyc commented 1 year ago

Thanks for pinpointing this issue @etiennebacher! I was hoping to unblock this issue for the report paper, since our demo reproduces it. But I'm blocked at the same place as you. As I have no experience with Bayesian models, I'm also not sure which line should be kept from:

> params[params$Parameter != "(Intercept)", ]
  Parameter Prior_Distribution Prior_Location Prior_Scale
2      qsec             normal              0    8.431924
3        wt             normal              0   15.399106

It seems this leads to having two models in report_model, with the only difference being the SD.

> report_model(x)
[...] Priors over parameters were set as normal (mean = 0.00, SD = 8.43) distributions
[...] Priors over parameters were set as normal (mean = 0.00, SD = 15.40) distributions

So it's not a TOTAL duplicate of the report output; it's more like it's trying to report two separate (but almost identical) models, one for each predictor/IV. If you check the original output from report() carefully, it matches the text above.

So I think the way to fix this issue is to first clarify and decide whether it should report two models or only one. If one, which of the two lines above should be kept? If two, then perhaps we can just make sure they are clearly split into separate paragraphs?
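To make the mechanism concrete, here is a minimal sketch (not report's actual implementation; the data frame and `sprintf` template are stand-ins) of how one prior row per predictor naturally yields one sentence per predictor, producing the near-duplicate texts seen above:

```r
# Hypothetical stand-in for bayestestR::describe_prior() output,
# after dropping the intercept row (values taken from the reprex above):
priors <- data.frame(
  Parameter = c("qsec", "wt"),
  Prior_Distribution = "normal",
  Prior_Location = 0,
  Prior_Scale = c(8.43, 15.40)
)

# Building one description per row gives two near-identical sentences,
# differing only in the autoscaled SD:
sentences <- sprintf(
  "Priors over parameters were set as %s (mean = %.2f, SD = %.2f) distributions.",
  priors$Prior_Distribution, priors$Prior_Location, priors$Prior_Scale
)
sentences
```

With two predictors this vector has two elements, so any downstream code that pastes all elements into the report repeats the whole sentence once per predictor.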

Since this is Bayesian stuff, I'm afraid we'll have to ask @DominiqueMakowski for a bit of advice here... 😬

rempsyc commented 1 year ago

What I don't understand is how the `Prior_Scale` value translates to a standard deviation (SD) in the text. Also, right now `report_text.stanreg` is defined as `report_text.lm`. Do we need `report_text.stanreg` to have its own report_text method to account for this situation? Or can it be resolved more simply?

The critical sentence is

Priors over parameters were set as normal (mean = 0.00, SD = 8.43) distributions.

So a Bayes question would be: normally when priors over parameters are set as normal, is that set for each predictor?

If so should we have the output only once, but report the mean and SD of ALL predictors? Something like:

Priors over parameters were set as normal (qsec mean = 0.00, SD = 8.43; wt mean = 0.00, SD = 15.40) distributions.

rempsyc commented 1 year ago

If going for the last option, it can be solved with a simple collapse by adding this on line 83: `values <- paste0(values, collapse = "; ")`. Result:

devtools::load_all("c:/github/report")
library(rstanarm)
x <- stan_glm(mpg ~ qsec + wt, data = mtcars, refresh = 0, iter = 500)
report(x)

#> We fitted a Bayesian linear model (estimated using MCMC sampling with 4 chains
#> of 500 iterations and a warmup of 250) to predict mpg with qsec and wt
#> (formula: mpg ~ qsec + wt). Priors over parameters were all set as normal (mean
#> = 0.00, SD = 8.43; mean = 0.00, SD = 15.40) distributions. The model's
#> explanatory power is substantial (R2 = 0.81, 95% CI [0.69, 0.88], adj. R2 =
#> 0.79). The model's intercept, corresponding to qsec = 0 and wt = 0, is at 19.56
#> (95% CI [8.34, 31.33]). Within this model:
#> 
#>   - The effect of qsec (Median = 0.94, 95% CI [0.35, 1.52]) has a 99.90%
#> probability of being positive (> 0), 98.20% of being significant (> 0.30), and
#> 0.20% of being large (> 1.81). The estimation successfully converged (Rhat =
#> 1.002) but the indices are unreliable (ESS = 867)
#>   - The effect of wt (Median = -5.04, 95% CI [-6.01, -4.11]) has a 100.00%
#> probability of being negative (< 0), 100.00% of being significant (< -0.30),
#> and 100.00% of being large (< -1.81). The estimation successfully converged
#> (Rhat = 1.004) but the indices are unreliable (ESS = 992)
#> 
#> Following the Sequential Effect eXistence and sIgnificance Testing (SEXIT)
#> framework, we report the median of the posterior distribution and its 95% CI
#> (Highest Density Interval), along the probability of direction (pd), the
#> probability of significance and the probability of being large. The thresholds
#> beyond which the effect is considered as significant (i.e., non-negligible) and
#> large are |0.30| and |1.81| (corresponding respectively to 0.05 and 0.30 of the
#> outcome's SD). Convergence and stability of the Bayesian sampling has been
#> assessed using R-hat, which should be below 1.01 (Vehtari et al., 2019), and
#> Effective Sample Size (ESS), which should be greater than 1000 (Burkner, 2017).

Created on 2023-03-05 with reprex v2.0.2
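The collapse itself can be illustrated in isolation (the fragments below are hypothetical stand-ins for the per-parameter prior texts):

```r
# Two per-parameter prior descriptions, as produced before the fix:
values <- c("mean = 0.00, SD = 8.43", "mean = 0.00, SD = 15.40")

# The proposed one-liner joins them into a single string, so the
# surrounding sentence is emitted only once:
values <- paste0(values, collapse = "; ")
values
#> [1] "mean = 0.00, SD = 8.43; mean = 0.00, SD = 15.40"
```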

Would that be an acceptable solution for now? If so I can push a PR and we can always revisit later when we get confirmation from Dom.

(though I don't really understand why the means would be 0 there but the SDs so huge)
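For what it's worth, the values look consistent with rstanarm's default prior autoscaling: coefficients get a normal(0, 2.5) prior whose scale is multiplied by sd(y)/sd(x) for each predictor (see the rstanarm priors vignette linked in the startup message). A quick base-R check reproduces both SDs from the duplicated text:

```r
# rstanarm's default autoscaled prior scale for each coefficient is
# 2.5 * sd(y) / sd(x); for mpg ~ qsec + wt on mtcars this gives the
# 8.43 and 15.40 seen in the report output:
scales <- with(mtcars, 2.5 * sd(mpg) / c(qsec = sd(qsec), wt = sd(wt)))
round(scales, 2)
# qsec: 8.43, wt: 15.40 -- matching the two SDs in the duplicated text
```

So the means are 0 because the default prior is centered at zero, and the SDs differ (and look "huge") because each one is rescaled by the ratio of the outcome's SD to that predictor's SD.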