vincentarelbundock / marginaleffects

R package to compute and plot predictions, slopes, marginal means, and comparisons (contrasts, risk ratios, odds, etc.) for over 100 classes of statistical and ML models. Conduct linear and non-linear hypothesis tests, or equivalence tests. Calculate uncertainty estimates using the delta method, bootstrapping, or simulation-based inference
https://marginaleffects.com

plot_predictions() does not maintain factor ordering #1109

Closed andrewheiss closed 1 month ago

andrewheiss commented 1 month ago

When plot_predictions() is used with a model whose outcome variable is an ordered factor, the factor ordering is lost in the plot.

Here's a reprex:

library(marginaleffects)
library(ggplot2)

categories <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)

set.seed(1234)
df <- data.frame(
  answer = sample(categories, 500, replace = TRUE),
  fav_color = sample(c("red", "blue", "green"), 500, replace = TRUE)
)
df$answer <- factor(df$answer, levels = categories, ordered = TRUE)

model <- MASS::polr(answer ~ fav_color, data = df, Hess = TRUE)

These rows are in the right order because {marginaleffects} arranges them that way internally, but the group column is just a character vector, so the order is fragile:

preds <- avg_predictions(model)
preds
#> 
#>                       Group Estimate Std. Error    z Pr(>|z|)     S 2.5 % 97.5 %
#>  Strongly disagree             0.192     0.0176 10.9   <0.001  89.5 0.157  0.226
#>  Disagree                      0.184     0.0173 10.6   <0.001  85.1 0.150  0.218
#>  Neither agree nor disagree    0.226     0.0187 12.1   <0.001 109.2 0.190  0.263
#>  Agree                         0.196     0.0177 11.0   <0.001  91.7 0.161  0.231
#>  Strongly agree                0.202     0.0179 11.2   <0.001  95.1 0.167  0.237
#> Columns: group, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
#> Type:  probs
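
To illustrate why a character column is fragile, here is a minimal base R sketch. It does not call {marginaleffects}; the vector `group_chr` is a hypothetical stand-in for the `group` column described above. As far as I understand, anything that coerces a character vector to a factor (as faceting does) falls back to alphabetical levels, while an ordered factor carries its own levels along.

```r
categories <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)

# Hypothetical stand-in for the character `group` column
group_chr <- categories

# Coercing a character vector to a factor sorts the unique values,
# so the intended order is replaced by alphabetical order
alphabetical_levels <- levels(factor(group_chr))

# An ordered factor keeps the levels it was given
group_ord <- factor(group_chr, levels = categories, ordered = TRUE)
preserved_levels <- levels(group_ord)

print(alphabetical_levels)
print(preserved_levels)
```
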

The group facets are in alphabetical order here:

plot_predictions(model, condition = "fav_color") +
  facet_wrap(vars(group))

[image: predictions by fav_color, facets in alphabetical order]

We can re-specify the levels and order for group manually:

plot_predictions(model, condition = "fav_color") +
  facet_wrap(vars(factor(group, levels = categories)))

[image: predictions by fav_color, facets in the order given by categories]
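
The manual fix above repeats the `categories` vector by hand. A slightly more general sketch, assuming the original outcome factor is still available, copies the levels and orderedness from that factor instead. The helper name `restore_levels()` is hypothetical and is not part of {marginaleffects}; the example data below is made up so the block runs on its own.

```r
# Hypothetical helper: rebuild a character vector as a factor using the
# levels (and orderedness) of a reference factor
restore_levels <- function(x, reference) {
  stopifnot(is.factor(reference))
  factor(
    as.character(x),
    levels = levels(reference),
    ordered = is.ordered(reference)
  )
}

# Toy outcome variable with a non-alphabetical order
answer <- factor(
  c("Agree", "Disagree", "Agree"),
  levels = c("Disagree", "Agree"),
  ordered = TRUE
)

# Toy character column, in alphabetical order as a plot would see it
group <- c("Agree", "Disagree")

restored <- restore_levels(group, answer)
print(levels(restored))
```

With the reprex above, something like `facet_wrap(vars(restore_levels(group, df$answer)))` should presumably give the same facets as the manual version, though I have not checked it against every model type.
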


{dplyr} and {tibble} maintain the column class when grouping/summarizing/etc:

# dplyr keeps things as ordered factors internally
library(dplyr)

df1 <- df |> 
  group_by(answer) |> 
  summarize(n = n())
df1
#> # A tibble: 5 × 2
#>   answer                         n
#>   <ord>                      <int>
#> 1 Strongly disagree             96
#> 2 Disagree                      92
#> 3 Neither agree nor disagree   113
#> 4 Agree                         98
#> 5 Strongly agree               101

class(df1$answer)
#> [1] "ordered" "factor"

But maybe {data.table}, or whatever {marginaleffects} is using behind the scenes, doesn't do that? (Or is philosophically opposed to doing that? I don't know anything about {data.table}.) Factors, and especially ordered factors, are weird and unwieldy.


Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.3 (2024-02-29)
#>  os       macOS Sonoma 14.4.1
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2024-05-02
#>  pandoc   3.1.11 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/aarch64/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package         * version    date (UTC) lib source
#>  backports         1.4.1      2021-12-13 [1] CRAN (R 4.3.0)
#>  checkmate         2.3.1      2023-12-04 [1] CRAN (R 4.3.1)
#>  cli               3.6.2      2023-12-11 [1] CRAN (R 4.3.1)
#>  colorspace        2.1-0      2023-01-23 [1] CRAN (R 4.3.0)
#>  curl              5.2.0      2023-12-08 [1] CRAN (R 4.3.1)
#>  data.table        1.15.4     2024-03-30 [1] CRAN (R 4.3.1)
#>  digest            0.6.33     2023-07-07 [1] CRAN (R 4.3.0)
#>  dplyr           * 1.1.4      2023-11-17 [1] CRAN (R 4.3.1)
#>  evaluate          0.23       2023-11-01 [1] CRAN (R 4.3.1)
#>  fansi             1.0.6      2023-12-08 [1] CRAN (R 4.3.1)
#>  farver            2.1.1      2022-07-06 [1] CRAN (R 4.3.0)
#>  fastmap           1.1.1      2023-02-24 [1] CRAN (R 4.3.0)
#>  fs                1.6.3      2023-07-20 [1] CRAN (R 4.3.0)
#>  generics          0.1.3      2022-07-05 [1] CRAN (R 4.3.0)
#>  ggplot2         * 3.5.0      2024-02-23 [1] CRAN (R 4.3.1)
#>  glue              1.6.2      2022-02-24 [1] CRAN (R 4.3.0)
#>  gtable            0.3.4      2023-08-21 [1] CRAN (R 4.3.0)
#>  highr             0.10       2022-12-22 [1] CRAN (R 4.3.0)
#>  htmltools         0.5.7      2023-11-03 [1] CRAN (R 4.3.1)
#>  insight           0.19.10    2024-03-22 [1] CRAN (R 4.3.1)
#>  knitr             1.45       2023-10-30 [1] CRAN (R 4.3.1)
#>  labeling          0.4.3      2023-08-29 [1] CRAN (R 4.3.0)
#>  lifecycle         1.0.4      2023-11-07 [1] CRAN (R 4.3.1)
#>  magrittr          2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
#>  marginaleffects * 0.19.0.1   2024-04-16 [1] https://vincentarelbundock.r-universe.dev (R 4.3.3)
#>  MASS              7.3-60.0.1 2024-01-13 [1] CRAN (R 4.3.3)
#>  munsell           0.5.0      2018-06-12 [1] CRAN (R 4.3.0)
#>  pillar            1.9.0      2023-03-22 [1] CRAN (R 4.3.0)
#>  pkgconfig         2.0.3      2019-09-22 [1] CRAN (R 4.3.0)
#>  purrr             1.0.2      2023-08-10 [1] CRAN (R 4.3.1)
#>  R.cache           0.16.0     2022-07-21 [1] CRAN (R 4.3.0)
#>  R.methodsS3       1.8.2      2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo              1.25.0     2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils           2.12.3     2023-11-18 [1] CRAN (R 4.3.1)
#>  R6                2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
#>  Rcpp              1.0.12     2024-01-09 [1] CRAN (R 4.3.1)
#>  reprex            2.0.2      2022-08-17 [1] CRAN (R 4.3.0)
#>  rlang             1.1.3      2024-01-10 [1] CRAN (R 4.3.1)
#>  rmarkdown         2.25       2023-09-18 [1] CRAN (R 4.3.1)
#>  rstudioapi        0.15.0     2023-07-07 [1] CRAN (R 4.3.0)
#>  scales            1.3.0      2023-11-28 [1] CRAN (R 4.3.1)
#>  sessioninfo       1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
#>  styler            1.10.2     2023-08-29 [1] CRAN (R 4.3.0)
#>  tibble            3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
#>  tidyselect        1.2.0      2022-10-10 [1] CRAN (R 4.3.0)
#>  utf8              1.2.4      2023-10-22 [1] CRAN (R 4.3.1)
#>  vctrs             0.6.5      2023-12-01 [1] CRAN (R 4.3.1)
#>  withr             2.5.2      2023-10-30 [1] CRAN (R 4.3.1)
#>  xfun              0.41       2023-11-01 [1] CRAN (R 4.3.1)
#>  xml2              1.3.6      2023-12-04 [1] CRAN (R 4.3.1)
#>  yaml              2.3.8      2023-12-11 [1] CRAN (R 4.3.1)
#>
#>  [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

vincentarelbundock commented 1 month ago

Thanks for the report. Could you please try version 0.19.0.5 from GitHub?