Include self defined parameters like indirect effects for lavaan::sem() model output

Ohlsen commented 3 years ago

First of all, this is more of a feature request, not a exaclty a blocker for acceptance.

In the JOSS Paper lavaan::sem() is listed in the image of modeling methods supported by the parameters-package. The vignette https://stat.ethz.ch/CRAN/web/packages/parameters/vignettes/efa_cfa.html offers an explanation how parameters() works for EFA and CFA, but keeps quiet of lavaan::sem(). Structural equation models have an enormous amount of modeling possibilities. For example, it is possible to estimate indirect effects in partial mediation structures. For this, the effects have to be defined by the ':=' operator. At the moment, parameters does not extract those.

library(lavaan)
library(parameters)

set.seed(1234)
X <- rnorm(100)
M <- 0.5*X + rnorm(100)
Y <- 0.7*M + rnorm(100)
Data <- data.frame(X = X, Y = Y, M = M)
model <- ' # direct effect
             Y ~ c*X
           # mediator
             M ~ a*X
             Y ~ b*M
           # indirect effect (a*b)
             ab := a*b
           # total effect
             total := c + (a*b)
         '
fit <- sem(model, data = Data, test = "Satorra-Bentler")

summary(fit)
model_parameters(fit, standardize = TRUE)

I think, having the possibility to include those effects in the parameters-package would be a useful addition. Concerning the image in the JOSS-paper with the supported models, i would suggest to empasize the support of lavaan::cfa() instead of lavaan::sem() at the moment, because the vignette is pretty silent about lavaan::sem() and some output ist not yet available in the parameters package.

strengejacke commented 3 years ago

I added the "definition" parameters now as well:

library(lavaan)
#> This is lavaan 0.6-7
#> lavaan is BETA software! Please report any bugs.
library(parameters)

set.seed(1234)
X <- rnorm(100)
M <- 0.5*X + rnorm(100)
Y <- 0.7*M + rnorm(100)
Data <- data.frame(X = X, Y = Y, M = M)
model <- ' # direct effect
             Y ~ c*X
           # mediator
             M ~ a*X
             Y ~ b*M
           # indirect effect (a*b)
             ab := a*b
           # total effect
             total := c + (a*b)
         '
fit <- sem(model, data = Data, test = "Satorra-Bentler")

summary(fit)
#> lavaan 0.6-7 ended normally after 12 iterations
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of free parameters                          5
#>                                                       
#>   Number of observations                           100
#>                                                       
#> Model Test User Model:
#>                                               Standard      Robust
#>   Test Statistic                                 0.000       0.000
#>   Degrees of freedom                                 0           0
#> 
#> Parameter Estimates:
#> 
#>   Standard errors                             Standard
#>   Information                                 Expected
#>   Information saturated (h1) model          Structured
#> 
#> Regressions:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   Y ~                                                 
#>     X          (c)    0.036    0.104    0.348    0.728
#>   M ~                                                 
#>     X          (a)    0.474    0.103    4.613    0.000
#>   Y ~                                                 
#>     M          (b)    0.788    0.092    8.539    0.000
#> 
#> Variances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>    .Y                 0.898    0.127    7.071    0.000
#>    .M                 1.054    0.149    7.071    0.000
#> 
#> Defined Parameters:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>     ab                0.374    0.092    4.059    0.000
#>     total             0.410    0.125    3.287    0.001

model_parameters(fit, standardize = TRUE)
#> # Regression type
#> 
#> Link  | Coefficient |   SE |        95% CI |      p
#> ---------------------------------------------------
#> Y ~ X |        0.03 | 0.08 | [-0.13, 0.18] | 0.728 
#> M ~ X |        0.42 | 0.08 | [ 0.26, 0.57] | < .001
#> Y ~ M |        0.68 | 0.06 | [ 0.56, 0.80] | < .001
#> 
#> # Defined type
#> 
#> Link             | Coefficient |   SE |       95% CI |      p
#> -------------------------------------------------------------
#> ab := a*b        |        0.28 | 0.06 | [0.16, 0.41] | < .001
#> total := c+(a*b) |        0.31 | 0.09 | [0.14, 0.48] | < .001

^{Created on 2020-08-26 by the reprex package (v0.3.0)}

Not sure, though, if the output still lacks some useful information? Is there more you would add to the output?

strengejacke commented 3 years ago

I improved the output a bit, here are two examples:

Example 1

library(lavaan)
#> This is lavaan 0.6-7
#> lavaan is BETA software! Please report any bugs.
library(parameters)

set.seed(1234)
X <- rnorm(100)
M <- 0.5*X + rnorm(100)
Y <- 0.7*M + rnorm(100)
Data <- data.frame(X = X, Y = Y, M = M)
model <- ' # direct effect
             Y ~ c*X
           # mediator
             M ~ a*X
             Y ~ b*M
           # indirect effect (a*b)
             ab := a*b
           # total effect
             total := c + (a*b)
         '
fit <- sem(model, data = Data, test = "Satorra-Bentler")

summary(fit)
#> lavaan 0.6-7 ended normally after 12 iterations
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of free parameters                          5
#>                                                       
#>   Number of observations                           100
#>                                                       
#> Model Test User Model:
#>                                               Standard      Robust
#>   Test Statistic                                 0.000       0.000
#>   Degrees of freedom                                 0           0
#> 
#> Parameter Estimates:
#> 
#>   Standard errors                             Standard
#>   Information                                 Expected
#>   Information saturated (h1) model          Structured
#> 
#> Regressions:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   Y ~                                                 
#>     X          (c)    0.036    0.104    0.348    0.728
#>   M ~                                                 
#>     X          (a)    0.474    0.103    4.613    0.000
#>   Y ~                                                 
#>     M          (b)    0.788    0.092    8.539    0.000
#> 
#> Variances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>    .Y                 0.898    0.127    7.071    0.000
#>    .M                 1.054    0.149    7.071    0.000
#> 
#> Defined Parameters:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>     ab                0.374    0.092    4.059    0.000
#>     total             0.410    0.125    3.287    0.001

model_parameters(fit, standardize = TRUE)
#> # Regression 
#> 
#> Link      | Coefficient |   SE |        95% CI |      p
#> -------------------------------------------------------
#> Y ~ X (c) |        0.03 | 0.08 | [-0.13, 0.18] | 0.728 
#> M ~ X (a) |        0.42 | 0.08 | [ 0.26, 0.57] | < .001
#> Y ~ M (b) |        0.68 | 0.06 | [ 0.56, 0.80] | < .001
#> 
#> # Defined 
#> 
#> Link             | Coefficient |   SE |       95% CI |      p
#> -------------------------------------------------------------
#> ab := a*b        |        0.28 | 0.06 | [0.16, 0.41] | < .001
#> total := c+(a*b) |        0.31 | 0.09 | [0.14, 0.48] | < .001

Example 2

structure <- "
    # latent variable definitions
      ind60 =~ x1 + x2 + x3
      dem60 =~ y1 + a*y2 + b*y3 + c*y4
      dem65 =~ y5 + a*y6 + b*y7 + c*y8
    # regressions
      dem60 ~ ind60
      dem65 ~ ind60 + dem60
    # residual correlations
      y1 ~~ y5
      y2 ~~ y4 + y6
      y3 ~~ y7
      y4 ~~ y8
      y6 ~~ y8
  "
model <- lavaan::sem(structure, data = PoliticalDemocracy)

summary(model)
#> lavaan 0.6-7 ended normally after 66 iterations
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of free parameters                         31
#>   Number of equality constraints                     3
#>                                                       
#>   Number of observations                            75
#>                                                       
#> Model Test User Model:
#>                                                       
#>   Test statistic                                40.179
#>   Degrees of freedom                                38
#>   P-value (Chi-square)                           0.374
#> 
#> Parameter Estimates:
#> 
#>   Standard errors                             Standard
#>   Information                                 Expected
#>   Information saturated (h1) model          Structured
#> 
#> Latent Variables:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   ind60 =~                                            
#>     x1                1.000                           
#>     x2                2.180    0.138   15.751    0.000
#>     x3                1.818    0.152   11.971    0.000
#>   dem60 =~                                            
#>     y1                1.000                           
#>     y2         (a)    1.191    0.139    8.551    0.000
#>     y3         (b)    1.175    0.120    9.755    0.000
#>     y4         (c)    1.251    0.117   10.712    0.000
#>   dem65 =~                                            
#>     y5                1.000                           
#>     y6         (a)    1.191    0.139    8.551    0.000
#>     y7         (b)    1.175    0.120    9.755    0.000
#>     y8         (c)    1.251    0.117   10.712    0.000
#> 
#> Regressions:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   dem60 ~                                             
#>     ind60             1.471    0.392    3.750    0.000
#>   dem65 ~                                             
#>     ind60             0.600    0.226    2.661    0.008
#>     dem60             0.865    0.075   11.554    0.000
#> 
#> Covariances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>  .y1 ~~                                               
#>    .y5                0.583    0.356    1.637    0.102
#>  .y2 ~~                                               
#>    .y4                1.440    0.689    2.092    0.036
#>    .y6                2.183    0.737    2.960    0.003
#>  .y3 ~~                                               
#>    .y7                0.712    0.611    1.165    0.244
#>  .y4 ~~                                               
#>    .y8                0.363    0.444    0.817    0.414
#>  .y6 ~~                                               
#>    .y8                1.372    0.577    2.378    0.017
#> 
#> Variances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>    .x1                0.081    0.019    4.182    0.000
#>    .x2                0.120    0.070    1.729    0.084
#>    .x3                0.467    0.090    5.177    0.000
#>    .y1                1.855    0.433    4.279    0.000
#>    .y2                7.581    1.366    5.549    0.000
#>    .y3                4.956    0.956    5.182    0.000
#>    .y4                3.225    0.723    4.458    0.000
#>    .y5                2.313    0.479    4.831    0.000
#>    .y6                4.968    0.921    5.393    0.000
#>    .y7                3.560    0.710    5.018    0.000
#>    .y8                3.308    0.704    4.701    0.000
#>     ind60             0.449    0.087    5.175    0.000
#>    .dem60             3.875    0.866    4.477    0.000
#>    .dem65             0.164    0.227    0.725    0.469

model_parameters(model)
#> # Loading 
#> 
#> Link            | Coefficient |   SE |       95% CI |      p
#> ------------------------------------------------------------
#> ind60 =~ x1     |        1.00 | 0.00 | [1.00, 1.00] | < .001
#> ind60 =~ x2     |        2.18 | 0.14 | [1.91, 2.45] | < .001
#> ind60 =~ x3     |        1.82 | 0.15 | [1.52, 2.12] | < .001
#> dem60 =~ y1     |        1.00 | 0.00 | [1.00, 1.00] | < .001
#> dem60 =~ y2 (a) |        1.19 | 0.14 | [0.92, 1.46] | < .001
#> dem60 =~ y3 (b) |        1.17 | 0.12 | [0.94, 1.41] | < .001
#> dem60 =~ y4 (c) |        1.25 | 0.12 | [1.02, 1.48] | < .001
#> dem65 =~ y5     |        1.00 | 0.00 | [1.00, 1.00] | < .001
#> dem65 =~ y6 (a) |        1.19 | 0.14 | [0.92, 1.46] | < .001
#> dem65 =~ y7 (b) |        1.17 | 0.12 | [0.94, 1.41] | < .001
#> dem65 =~ y8 (c) |        1.25 | 0.12 | [1.02, 1.48] | < .001
#> 
#> # Regression 
#> 
#> Link          | Coefficient |   SE |       95% CI |      p
#> ----------------------------------------------------------
#> dem60 ~ ind60 |        1.47 | 0.39 | [0.70, 2.24] | < .001
#> dem65 ~ ind60 |        0.60 | 0.23 | [0.16, 1.04] | 0.008 
#> dem65 ~ dem60 |        0.87 | 0.07 | [0.72, 1.01] | < .001
#> 
#> # Correlation 
#> 
#> Link     | Coefficient |   SE |        95% CI |     p
#> -----------------------------------------------------
#> y1 ~~ y5 |        0.58 | 0.36 | [-0.11, 1.28] | 0.102
#> y2 ~~ y4 |        1.44 | 0.69 | [ 0.09, 2.79] | 0.036
#> y2 ~~ y6 |        2.18 | 0.74 | [ 0.74, 3.63] | 0.003
#> y3 ~~ y7 |        0.71 | 0.61 | [-0.49, 1.91] | 0.244
#> y4 ~~ y8 |        0.36 | 0.44 | [-0.51, 1.23] | 0.414
#> y6 ~~ y8 |        1.37 | 0.58 | [ 0.24, 2.50] | 0.017

^{Created on 2020-08-26 by the reprex package (v0.3.0)}

strengejacke commented 3 years ago

I think this can be closed, the function now also returns self-defined parameters.

easystats / parameters

Include self defined parameters like indirect effects for lavaan::sem() model output #290

Example 1

Example 2