easystats / parameters

:bar_chart: Computation and processing of models' parameters
https://easystats.github.io/parameters/
GNU General Public License v3.0

`include_reference` in `parameters()`? #952

Closed vincentarelbundock closed 3 months ago

vincentarelbundock commented 3 months ago

Hi,

My understanding is that `include_reference = TRUE` is only available in `print()`. Does anyone know how difficult it would be to implement a version of this that inserts a new row with `NA`s into the raw data frame returned by `model_parameters()`?

My goal is to make this available through modelsummary, but I only use the intermediate data frame returned by `model_parameters()`, not its print method.

Thanks!

strengejacke commented 3 months ago

It's actually called in the format() method and indeed modifies the data frame (see https://github.com/easystats/parameters/blob/b3a88a5db95c014a19f1e93fbcbb4a6d44f84936/R/utils_format.R#L355).

I think it should be easy to do this earlier, so that `model_parameters()` already returns the modified data frame.
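To make the idea concrete, here is a minimal sketch in base R of what "inserting a reference row into the data frame" amounts to. It is not the package's actual code (that lives in the linked `utils_format.R`); `add_reference_row()` is a hypothetical helper, and it assumes an `lm` fit with default treatment contrasts, where the first factor level is the reference.

```r
# Hypothetical helper, NOT part of the parameters package.
# Assumes treatment contrasts, i.e. the first level of the factor is the reference.
add_reference_row <- function(model, factor_name) {
  est <- stats::coef(model)
  out <- data.frame(
    Parameter = names(est),
    Coefficient = unname(est),
    SE = unname(sqrt(diag(stats::vcov(model)))),
    stringsAsFactors = FALSE
  )

  # Factor levels used when fitting; the first one is the reference level
  lv <- model$xlevels[[factor_name]]

  # Reference row: estimate fixed at 0, all uncertainty columns missing
  ref <- data.frame(
    Parameter = paste0(factor_name, lv[1]),
    Coefficient = 0,
    SE = NA_real_,
    stringsAsFactors = FALSE
  )

  # Insert the reference row right before the first non-reference level
  first <- match(paste0(factor_name, lv[2]), out$Parameter)
  out <- rbind(out[seq_len(first - 1), ], ref, out[first:nrow(out), ])
  rownames(out) <- NULL
  out
}

model <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)
params <- add_reference_row(model, "Species")
```

Here `params` has one extra row for `Speciessetosa`, placed directly above `Speciesversicolor`, with a coefficient of 0 and `NA` for the standard error.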

strengejacke commented 3 months ago
```r
library(parameters)
model <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)

model_parameters(model)
#> Parameter            | Coefficient |   SE |         95% CI | t(146) |      p
#> ----------------------------------------------------------------------------
#> (Intercept)          |        3.68 | 0.11 | [ 3.47,  3.89] |  34.72 | < .001
#> Petal Length         |        0.90 | 0.06 | [ 0.78,  1.03] |  13.96 | < .001
#> Species [versicolor] |       -1.60 | 0.19 | [-1.98, -1.22] |  -8.28 | < .001
#> Species [virginica]  |       -2.12 | 0.27 | [-2.66, -1.58] |  -7.74 | < .001
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald t-distribution approximation.

model_parameters(model, include_reference = TRUE)
#> Parameter            | Coefficient |   SE |         95% CI | t(146) |      p
#> ----------------------------------------------------------------------------
#> (Intercept)          |        3.68 | 0.11 | [ 3.47,  3.89] |  34.72 | < .001
#> Petal Length         |        0.90 | 0.06 | [ 0.78,  1.03] |  13.96 | < .001
#> Species [setosa]     |        0.00 |      |                |        |       
#> Species [versicolor] |       -1.60 | 0.19 | [-1.98, -1.22] |  -8.28 | < .001
#> Species [virginica]  |       -2.12 | 0.27 | [-2.66, -1.58] |  -7.74 | < .001
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald t-distribution approximation.

as.data.frame(model_parameters(model, include_reference = TRUE))
#>           Parameter Coefficient         SE   CI     CI_low   CI_high         t
#> 1       (Intercept)   3.6835266 0.10609608 0.95  3.4738440  3.893209 34.718780
#> 2      Petal.Length   0.9045646 0.06478559 0.95  0.7765259  1.032603 13.962436
#> 3     Speciessetosa   0.0000000         NA   NA         NA        NA        NA
#> 4 Speciesversicolor  -1.6009717 0.19346616 0.95 -1.9833277 -1.218616 -8.275203
#> 5  Speciesvirginica  -2.1176692 0.27346121 0.95 -2.6581230 -1.577215 -7.743947
#>   df_error            p
#> 1      146 1.968671e-72
#> 2      146 1.121002e-28
#> 3       NA           NA
#> 4      146 7.371529e-14
#> 5      146 1.480296e-12
```

Created on 2024-03-07 with reprex v2.1.0

strengejacke commented 3 months ago

Wait, it should work in combination with pretty_names = FALSE, right?

strengejacke commented 3 months ago

OK, for developers for whom speed is crucial (and who therefore set `pretty_names = FALSE`), this now works too:

```r
library(parameters)
model <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)

as.data.frame(model_parameters(model, include_reference = TRUE, pretty_names = FALSE))
#>           Parameter Coefficient         SE   CI     CI_low   CI_high         t
#> 1       (Intercept)   3.6835266 0.10609608 0.95  3.4738440  3.893209 34.718780
#> 2      Petal.Length   0.9045646 0.06478559 0.95  0.7765259  1.032603 13.962436
#> 3     Speciessetosa   0.0000000         NA   NA         NA        NA        NA
#> 4 Speciesversicolor  -1.6009717 0.19346616 0.95 -1.9833277 -1.218616 -8.275203
#> 5  Speciesvirginica  -2.1176692 0.27346121 0.95 -2.6581230 -1.577215 -7.743947
#>   df_error            p
#> 1      146 1.968671e-72
#> 2      146 1.121002e-28
#> 3       NA           NA
#> 4      146 7.371529e-14
#> 5      146 1.480296e-12
```

Created on 2024-03-07 with reprex v2.1.0
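For downstream code that consumes this data frame, one possible way to tell the inserted reference row apart from estimated coefficients is sketched below. The rule used here (estimate of 0 and a missing standard error) is an assumption based only on the output shown above, not a documented contract of the package. The values are copied from that output, with most columns dropped for brevity.

```r
# Values copied from the data frame printed above (subset of columns only)
params <- data.frame(
  Parameter = c("(Intercept)", "Petal.Length", "Speciessetosa",
                "Speciesversicolor", "Speciesvirginica"),
  Coefficient = c(3.6835266, 0.9045646, 0, -1.6009717, -2.1176692),
  SE = c(0.10609608, 0.06478559, NA, 0.19346616, 0.27346121),
  stringsAsFactors = FALSE
)

# Assumption: a reference row has a zero estimate and no standard error
is_reference <- params$Coefficient == 0 & is.na(params$SE)

# Names of the reference-level rows
reference_names <- params$Parameter[is_reference]

# Only the coefficients that were actually estimated
estimated <- params[!is_reference, ]
```

A table-building package could use `is_reference` to render the row differently (for example, with a dash or "ref." in place of the estimate) rather than printing `NA`.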

vincentarelbundock commented 3 months ago

Oh wow, this is awesome! I just tried it and it works perfectly. Thanks so much!!