easystats / parameters

:bar_chart: Computation and processing of models' parameters
https://easystats.github.io/parameters/
GNU General Public License v3.0

`include_reference = TRUE` doesn't work in combination with `as.factor()` #956

Closed: snhansen closed this issue 3 months ago

snhansen commented 3 months ago

Shouldn't the following two examples yield the same output? It appears that the reference level isn't included when a variable is converted to a factor in the model formula. You get the same behaviour when the explanatory variable is a character variable, probably because it's converted to a factor on the fly.

mtcars |>
  dplyr::mutate(gear = factor(gear)) |>
  lm(mpg ~ gear, data = _) |> 
  parameters::parameters() |>
  print(include_reference = TRUE)
#> Parameter   | Coefficient |   SE |         95% CI | t(29) |      p
#> ------------------------------------------------------------------
#> (Intercept) |       16.11 | 1.22 | [13.62, 18.59] | 13.25 | < .001
#> gear [3]    |        0.00 |      |                |       |       
#> gear [4]    |        8.43 | 1.82 | [ 4.70, 12.16] |  4.62 | < .001
#> gear [5]    |        5.27 | 2.43 | [ 0.30, 10.25] |  2.17 | 0.038

lm(mpg ~ as.factor(gear), data = mtcars) |> 
  parameters::parameters() |>
  print(include_reference = TRUE)
#> Parameter   | Coefficient |   SE |         95% CI | t(29) |      p
#> ------------------------------------------------------------------
#> (Intercept) |       16.11 | 1.22 | [13.62, 18.59] | 13.25 | < .001
#> gear [4]    |        8.43 | 1.82 | [ 4.70, 12.16] |  4.62 | < .001
#> gear [5]    |        5.27 | 2.43 | [ 0.30, 10.25] |  2.17 | 0.038
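
One way to check that the models above differ only in how the term is labelled, and not in the fit itself, is to compare them in base R. This is a minimal sketch that assumes nothing beyond `stats::lm()`; the object names `m1`, `m2`, `m3`, `dat` and `dat_chr` are illustrative. The `xlevels` element of the fitted model stores the factor levels under the term label, so the reference level is recorded however the factor was created, including for a character predictor.

```r
# Factor created in the data before fitting
dat <- transform(mtcars, gear = factor(gear))
m1 <- lm(mpg ~ gear, data = dat)

# Factor created inside the formula
m2 <- lm(mpg ~ as.factor(gear), data = mtcars)

# Character predictor: lm() treats it like a factor when building the design matrix
dat_chr <- transform(mtcars, gear = as.character(gear))
m3 <- lm(mpg ~ gear, data = dat_chr)

# Same estimates, different coefficient names
all.equal(unname(coef(m1)), unname(coef(m2)))
all.equal(unname(coef(m1)), unname(coef(m3)))
names(coef(m1))
names(coef(m2))

# Levels are stored under the term label; the first level is the reference
m1$xlevels
m2$xlevels
m3$xlevels
```

Since the levels are available in all three cases, the difference in the printed tables seems to come from how the term label `as.factor(gear)` is matched, not from missing information in the model.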

strengejacke commented 3 months ago

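Handling of as.character() can still be improved: in the second example below, the reference level is shown as am [0], but the other level keeps the uncleaned name as character(am) [1].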

lm(mpg ~ as.factor(gear) + factor(am) + hp, data = mtcars) |>
  parameters::parameters() |>
  print(include_reference = TRUE)
#> Parameter   | Coefficient |   SE |         95% CI | t(27) |      p
#> ------------------------------------------------------------------
#> (Intercept) |       27.48 | 1.97 | [23.43, 31.53] | 13.92 | < .001
#> gear [3]    |        0.00 |      |                |       |       
#> gear [4]    |        0.08 | 1.83 | [-3.68,  3.83] |  0.04 | 0.967 
#> gear [5]    |        2.39 | 2.38 | [-2.50,  7.29] |  1.00 | 0.324 
#> am [0]      |        0.00 |      |                |       |       
#> am [1]      |        4.14 | 1.81 | [ 0.42,  7.85] |  2.29 | 0.030 
#> hp          |       -0.06 | 0.01 | [-0.09, -0.04] | -6.24 | < .001
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald t-distribution approximation.

lm(mpg ~ as.factor(gear) + as.character(am) + hp, data = mtcars) |>
  parameters::parameters() |>
  print(include_reference = TRUE)
#> Parameter            | Coefficient |   SE |         95% CI | t(27) |      p
#> ---------------------------------------------------------------------------
#> (Intercept)          |       27.48 | 1.97 | [23.43, 31.53] | 13.92 | < .001
#> gear [3]             |        0.00 |      |                |       |       
#> gear [4]             |        0.08 | 1.83 | [-3.68,  3.83] |  0.04 | 0.967 
#> gear [5]             |        2.39 | 2.38 | [-2.50,  7.29] |  1.00 | 0.324 
#> am [0]               |        0.00 |      |                |       |       
#> as character(am) [1] |        4.14 | 1.81 | [ 0.42,  7.85] |  2.29 | 0.030 
#> hp                   |       -0.06 | 0.01 | [-0.09, -0.04] | -6.24 | < .001
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald t-distribution approximation.

Created on 2024-03-14 with reprex v2.1.0
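
The leftover `as character(am) [1]` label suggests that the wrapping call is not stripped from the term name in this case. As a rough illustration of the kind of clean-up involved, here is a sketch of a hypothetical helper, `strip_wrapper()`, that removes a single `factor()`, `as.factor()` or `as.character()` call around a variable name. It is not the code used by parameters, only an assumption about what such a step could look like.

```r
# Hypothetical helper (not part of parameters): strip one wrapping call
# such as factor(x), as.factor(x) or as.character(x) from a term label.
strip_wrapper <- function(term) {
  pattern <- "^(factor|as\\.factor|as\\.character)\\((.*)\\)$"
  sub(pattern, "\\2", term)
}

m <- lm(mpg ~ as.factor(gear) + as.character(am) + hp, data = mtcars)

# Term labels as they appear in the formula
terms_raw <- attr(terms(m), "term.labels")
terms_raw

# Cleaned labels; hp has no wrapper and is returned unchanged
strip_wrapper(terms_raw)
```

The pattern only handles a bare variable inside the call; something like `factor(am, levels = c(1, 0))` would need real parsing of the expression and is left out of this sketch.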

bwiernik commented 3 months ago

I don't think as.character() is something to be concerned about, as it isn't really a modeling function.
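
If the `as.character()` label is a concern in practice, one way to sidestep it is to convert the predictors before fitting, as in the first example of the original report. The sketch below uses base R `transform()` in place of `dplyr::mutate()`; the object names are illustrative, and the final call runs only if parameters is installed.

```r
# Convert the predictors to factors before fitting instead of in the formula
dat <- transform(mtcars, gear = factor(gear), am = factor(am))
m <- lm(mpg ~ gear + am + hp, data = dat)

# Coefficient names now carry the plain variable names
names(coef(m))

# Reference level (first level) of each factor
vapply(m$xlevels, function(lv) lv[1], character(1))

# If parameters is installed, print the table with reference levels included
if (requireNamespace("parameters", quietly = TRUE)) {
  print(parameters::parameters(m), include_reference = TRUE)
}
```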