easystats / performance

:muscle: Models' quality and performance metrics (R2, ICC, LOO, AIC, BF, ...)
https://easystats.github.io/performance/
GNU General Public License v3.0
1.03k stars 94 forks source link

Issues with check_model() for Stan models #354

Open bwiernik opened 3 years ago

bwiernik commented 3 years ago

I'm not exactly what the thinking is behind the current behavior of check_model() for Stan models. It calls the internal function performance:::.check_assumptions_stan(). This returns a data frame with columns Group (Prior or Posterior), y (parameter), x (value), and id (posterior draw).

I'm not exactly sure what sort of plot this is intended to be used for (perhaps a scatterplot matrix for the various parameters across posterior draws?). Could someone elaborate? @strengejacke @mattansb? Nevertheless, it is really different from the results for lm/glm models, where a list with the various diagnostic data frames is returned. This produces several problems:

  1. see::plot.see_check_model() fails for check_model(stan_model) objects

    • This function expects to find the names of plots in a list (QQ, VIF, etc.). It errors when given the single data frame from performance:::.check_assumptions_stan().
  2. There is no way for users to generate the typical regression model diagnositic plots, such as fitted-residual plots, qq plots, etc.

  3. The current performance:::.check_assumptions_stan() function fails often for brmsfit models

    • If sample_priors = "no" (default), it stops. It could instead return a data frame with just the posterior.
    • If improper priors are used (default), the reshaping fails. It should be robust to the non-sampled priors.

This leads to 2 big questions:

  1. What should check_model() do for Stan models?

    • My preference would be for it to return the same sort of plots as for lm/glm objects.
    • Could you describe what sort of plot was intended with the current output? Could we implement that as a different function (similar to pp_check())?
  2. What should we do in the mean time?

    • My suggestion would be to internally call bayestestR::convert_bayesian_as_frequentist(), then call check_model() on that. I think this would be an okay stopgap until we implement the individual plots for stan objects.

Thoughts?

bwiernik commented 3 years ago

Here is an example of the issues:

library(performance)

model_stan <- rstanarm::stan_glm(mpg ~ wt + cyl, data = mtcars)
model_brms <- brms::brm(mpg ~ wt + cyl, data = mtcars)
model_brms_wPrior <- brms::brm(mpg ~ wt + cyl, data = mtcars, sample_prior = "yes")
model_lm   <- lm(mpg ~ wt + cyl, data = mtcars)
model_glm  <- glm(mpg ~ wt + cyl, data = mtcars)

check_stan <- check_model(model_stan)
check_brms <- check_model(model_brms) # need to sample priors
check_brms_wPrior <- check_model(model_brms_wPrior) # stil fails due to improper priors
check_lm   <- check_model(model_lm)
check_glm  <- check_model(model_glm)

plot(check_stan) # fails
plot(check_lm)
plot(check_glm)
strengejacke commented 3 years ago

I would agree to your suggestions. My only question is, if the "typical" plots created by check_model() would also be useful / applicable in a Bayesian context?

strengejacke commented 3 years ago

The basic idea for this plot was actually to mimic sjPlot::plot_model(type = "diag"), to have a quick implementation of model diagnostics, which then can be enhanced further. We actually only focused on freq. models, not Bayesian. What check_model() for Stan-models is supposed to do is following:

model_stan <- rstanarm::stan_glm(mpg ~ wt + cyl, data = mtcars)
sjPlot::plot_model(model_stan, type = "diag")

Created on 2021-08-19 by the reprex package (v2.0.1)

bwiernik commented 3 years ago

I'm thinking that Residuals plots, VIF, and normality plots absolutely. These same assumptions apply to linear models regardless of the inference framework.

Influence plots should probably use a Bayesian analogue to Cook's distance or a pointwise LOO-IC--there are some papers and existing functions I found yesterday we can draw on there.

The stopgap of converting to a frequentist model for now should be generally reasonable for common cases.

strengejacke commented 3 years ago

btw, @DominiqueMakowski, this doc needs some more love, I think ;-)

strengejacke commented 3 years ago

ok, the intermediate fix should work now. I'll leave this open for later reference.

bwiernik commented 3 years ago

I am forever amazed at your speed!

I like the prior-posterior plot, but I think it would be best as another function. Thoughts on a name?

strengejacke commented 3 years ago

I like the prior-posterior plot, but I think it would be best as another function. Thoughts on a name?

I think this is that urgent, since we can already add layers of prior samples to the plot: https://easystats.github.io/see/articles/bayestestR.html#adding-prior-samples-4

bwiernik commented 3 years ago

@strengejacke Some of the tests are failing after your last commit, so the change isn't getting built on r-universe. Could you take a look?

strengejacke commented 3 years ago

This doesn't seem to be related to my commit, but rather to the tests for collinearity with afex. I think @mattansb wrote these tests, due to the change in afex 1.0 - Mattan, can you look at this? check_collinearity() is exptected to fail, but it doesn't.

if (require("testthat") && require("performance") && require("afex") && utils::packageVersion("afex") >= package_version("1.0.0")) {
  test_that("check_collinearity | afex", {
    data(obk.long, package = "afex")

    obk.long$treatment <- as.character(obk.long$treatment)
    suppressWarnings(suppressMessages({
      aM <- afex::aov_car(value ~ treatment * gender + Error(id / (phase * hour)),
                          data = obk.long
      )

      aW <- afex::aov_car(value ~ Error(id / (phase * hour)),
                          data = obk.long
      )

      aB <- afex::aov_car(value ~ treatment * gender + Error(id),
                          data = obk.long
      )
    }))

    expect_error(check_collinearity(aM))
    expect_error(check_collinearity(aW))
    expect_error(check_collinearity(aB))

    suppressWarnings(suppressMessages({
      aM <- afex::aov_car(value ~ treatment * gender + Error(id / (phase * hour)),
                          include_aov = TRUE,
                          data = obk.long
      )

      aW <- afex::aov_car(value ~ Error(id / (phase * hour)),
                          include_aov = TRUE,
                          data = obk.long
      )

      aB <- afex::aov_car(value ~ treatment * gender + Error(id),
                          include_aov = TRUE,
                          data = obk.long
      )
    }))

    expect_message(ccoM <- check_collinearity(aM))
    expect_message(ccoW <- check_collinearity(aW))
    expect_message(ccoB <- check_collinearity(aB))

    expect_equal(nrow(ccoM), 15)
    expect_equal(nrow(ccoW), 3)
    expect_equal(nrow(ccoB), 3)
  })
}
#> Loading required package: testthat
#> Loading required package: performance
#> Loading required package: afex
#> Loading required package: lme4
#> Loading required package: Matrix
#> ************
#> Welcome to afex. For support visit: http://afex.singmann.science/
#> - Functions for ANOVAs: aov_car(), aov_ez(), and aov_4()
#> - Methods for calculating p-values with mixed(): 'S', 'KR', 'LRT', and 'PB'
#> - 'afex_aov' and 'mixed' objects can be passed to emmeans() for follow-up tests
#> - NEWS: emmeans() for ANOVA models now uses model = 'multivariate' as default.
#> - Get and set global package options with: afex_options()
#> - Set orthogonal sum-to-zero contrasts globally: set_sum_contrasts()
#> - For example analyses see: browseVignettes("afex")
#> ************
#> 
#> Attaching package: 'afex'
#> The following object is masked from 'package:lme4':
#> 
#>     lmer
#> -- Failure (<text>:20:5): check_collinearity | afex ----------------------------
#> `check_collinearity(aM)` did not throw an error.
#> 
#> -- Failure (<text>:21:5): check_collinearity | afex ----------------------------
#> `check_collinearity(aW)` did not throw an error.
#> 
#> -- Failure (<text>:22:5): check_collinearity | afex ----------------------------
#> `check_collinearity(aB)` did not throw an error.
#> 
#> -- Failure (<text>:43:5): check_collinearity | afex ----------------------------
#> `ccoB <- check_collinearity(aB)` did not produce any messages.

Created on 2021-09-02 by the reprex package (v2.0.1)

mattansb commented 3 years ago

Oops, how did I miss this?

Fixed