vincentarelbundock / marginaleffects

R package to compute and plot predictions, slopes, marginal means, and comparisons (contrasts, risk ratios, odds, etc.) for over 100 classes of statistical and ML models. Conduct linear and non-linear hypothesis tests, or equivalence tests. Calculate uncertainty estimates using the delta method, bootstrapping, or simulation-based inference.
https://marginaleffects.com

Duplicated rows in pairwise tests for character vectors? #1105

Status: Closed (closed by strengejacke 1 month ago)

strengejacke commented 2 months ago

Consider the two examples below. In the first, groups is a character vector, and some rows of the output are duplicated (e.g., Row 1 - Row 2 appears twice). In the second, where groups is a factor, the output is as expected.

library(ggeffects)

set.seed(1234)
dat <- data.frame(
  outcome = rbinom(n = 100, size = 1, prob = 0.35),
  var_binom = as.factor(rbinom(n = 100, size = 1, prob = 0.3)),
  var_cont = rnorm(n = 100, mean = 10, sd = 7),
  groups = sample(letters[1:2], size = 100, replace = TRUE)
)
m1 <- glm(outcome ~ var_binom * groups + var_cont, data = dat, family = binomial())
d <- marginaleffects::datagrid(model = m1, by = c("var_binom", "groups"))

marginaleffects::predictions(
  m1,
  newdata = d,
  hypothesis = "pairwise"
)
#> Warning: The `type="invlink"` argument is not available unless `hypothesis` is
#>   `NULL` or a single number. The value of the `type` argument was changed
#>   to "response" automatically. To suppress this warning, use
#>   `type="response"` explicitly in your function call.
#> 
#>           Term var_binom groups Estimate Std. Error      z Pr(>|z|)   S  2.5 % 97.5 %
#>  Row 1 - Row 2         0      a   0.0997      0.118  0.847    0.397 1.3 -0.131  0.330
#>  Row 1 - Row 2         0      b   0.0997      0.118  0.847    0.397 1.3 -0.131  0.330
#>  Row 1 - Row 3         0      a   0.0272      0.110  0.247    0.805 0.3 -0.189  0.243
#>  Row 1 - Row 3         0      b   0.0272      0.110  0.247    0.805 0.3 -0.189  0.243
#>  Row 1 - Row 4         0      a   0.0697      0.140  0.499    0.617 0.7 -0.204  0.343
#>  Row 2 - Row 3         1      a  -0.0725      0.117 -0.620    0.535 0.9 -0.302  0.157
#>  Row 2 - Row 4         0      b  -0.0299      0.145 -0.207    0.836 0.3 -0.314  0.254
#>  Row 3 - Row 4         1      b   0.0425      0.139  0.306    0.760 0.4 -0.230  0.315
#> 
#> Columns: rowid, term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, var_cont, var_binom, groups, outcome 
#> Type:  response

set.seed(1234)
dat <- data.frame(
  outcome = rbinom(n = 100, size = 1, prob = 0.35),
  var_binom = as.factor(rbinom(n = 100, size = 1, prob = 0.3)),
  var_cont = rnorm(n = 100, mean = 10, sd = 7),
  groups = factor(sample(letters[1:2], size = 100, replace = TRUE))
)
m1 <- glm(outcome ~ var_binom * groups + var_cont, data = dat, family = binomial())
d <- marginaleffects::datagrid(model = m1, by = c("var_binom", "groups"))

marginaleffects::predictions(
  m1,
  newdata = d,
  hypothesis = "pairwise"
)
#> Warning: The `type="invlink"` argument is not available unless `hypothesis` is
#>   `NULL` or a single number. The value of the `type` argument was changed
#>   to "response" automatically. To suppress this warning, use
#>   `type="response"` explicitly in your function call.
#> 
#>           Term Estimate Std. Error      z Pr(>|z|)   S  2.5 % 97.5 %
#>  Row 1 - Row 2   0.0997      0.118  0.847    0.397 1.3 -0.131  0.330
#>  Row 1 - Row 3   0.0272      0.110  0.247    0.805 0.3 -0.189  0.243
#>  Row 1 - Row 4   0.0697      0.140  0.499    0.617 0.7 -0.204  0.343
#>  Row 2 - Row 3  -0.0725      0.117 -0.620    0.535 0.9 -0.302  0.157
#>  Row 2 - Row 4  -0.0299      0.145 -0.207    0.836 0.3 -0.314  0.254
#>  Row 3 - Row 4   0.0425      0.139  0.306    0.760 0.4 -0.230  0.315
#> 
#> Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
#> Type:  response

Created on 2024-05-01 with reprex v2.1.0
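
Since the second example above behaves as expected once groups is a factor, one possible stopgap is to convert character columns to factors before fitting the model. The sketch below uses base R only; chr_to_factor is a hypothetical helper name introduced here for illustration and is not part of marginaleffects or ggeffects.

```r
# Hypothetical helper (not part of marginaleffects): convert every
# character column of a data frame to a factor, leaving other columns as-is.
chr_to_factor <- function(data) {
  is_chr <- vapply(data, is.character, logical(1))
  for (col in names(data)[is_chr]) {
    data[[col]] <- factor(data[[col]])
  }
  data
}

set.seed(1234)
dat <- data.frame(
  outcome = rbinom(n = 100, size = 1, prob = 0.35),
  groups = sample(letters[1:2], size = 100, replace = TRUE)
)

# groups is a factor after this call, matching the second example above
dat <- chr_to_factor(dat)
str(dat$groups)
```

The converted data frame can then be passed to glm() and datagrid() exactly as in the reprex. Whether this sidesteps the duplicated rows in every case is an assumption based only on the two examples shown here.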

vincentarelbundock commented 1 month ago

Thanks for the report. Could you please try version 0.19.0.6 from GitHub?
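
For readers following along, a development version can typically be installed from GitHub roughly as sketched below. This assumes the remotes package is already installed and that the default branch of the repository contains the fix; neither detail is stated in the thread.

```r
# Assumption: the 'remotes' package is available
# (install.packages("remotes") otherwise).
remotes::install_github("vincentarelbundock/marginaleffects")

# After restarting the R session, confirm the installed version is
# at least the one mentioned in the comment above.
packageVersion("marginaleffects") >= "0.19.0.6"
```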

strengejacke commented 1 month ago

Yes, looks good! Thanks!