StructuralEquationModels / StructuralEquationModels.jl

A fast and flexible Structural Equation Modelling Framework
https://structuralequationmodels.github.io/StructuralEquationModels.jl/dev/
MIT License
45 stars 6 forks source link

How about providing z-test and p-value? #215

Open dsryu0822 opened 2 months ago

dsryu0822 commented 2 months ago

Hi, I'm new to SEM and strugled in the model inspection as a statistician. From A first model, I want check some statistics, at least p-value. Of course I can calculate p-value with se_hessian manually, but I had to waste few ours to find p-values of each estimates natively(StructuralEquationModels.p_value is $\chi^{2}$-statistic so only for whole model, right?).

Example

Please compare results with the lavaan tutorial.

julia> result
34×7 DataFrame
 Row │ from    parameter_type  to      estimate   se         z           p
     │ Symbol  Symbol          Symbol  Float64    Float64    Float64     Float64
─────┼─────────────────────────────────────────────────────────────────────────────────
   1 │ ind60   →               x1      0.0        0.0        NaN         NaN
   2 │ ind60   →               x2      2.18037    0.13995     15.5797      0.0
   3 │ ind60   →               x3      1.81853    0.153217    11.869       0.0
   4 │ dem60   →               y1      0.0        0.0        NaN         NaN
   5 │ dem60   →               y2      1.25673    0.18675      6.72947     1.70284e-11
   6 │ dem60   →               y3      1.05773    0.149329     7.0832      1.40865e-12
   7 │ dem60   →               y4      1.26476    0.151739     8.33508     0.0
   8 │ dem65   →               y5      0.0        0.0        NaN         NaN
   9 │ dem65   →               y6      1.18569    0.172503     6.87345     6.26699e-12
  10 │ dem65   →               y7      1.27952    0.161461     7.92462     2.22045e-15
  11 │ dem65   →               y8      1.26594    0.164316     7.70432     1.31006e-14
  12 │ ind60   →               dem60   1.48308    0.399938     3.70828     0.000208675
  13 │ dem60   →               dem65   0.837334   0.0994521    8.41947     0.0
  14 │ ind60   →               dem65   0.572362   0.235309     2.43239     0.0149996
  15 │ x1      ↔               x1      0.0826508  0.0201159    4.10873     3.97839e-5
  16 │ x2      ↔               x2      0.121427   0.0713891    1.70091     0.0889594
  17 │ x3      ↔               x3      0.473013   0.0909974    5.19809     2.01345e-7
  18 │ y1      ↔               y1      1.91686    0.478265     4.00795     6.12477e-5
  19 │ y2      ↔               y2      7.47309    1.37326      5.44188     5.27223e-8
  20 │ y3      ↔               y3      5.13556    0.987856     5.19869     2.00697e-7
  21 │ y4      ↔               y4      3.19058    0.771185     4.13724     3.51506e-5
  22 │ y5      ↔               y5      2.38269    0.498673     4.77805     1.77e-6
  23 │ y6      ↔               y6      5.02101    0.913668     5.49544     3.89744e-8
  24 │ y7      ↔               y7      3.4778     0.74263      4.68309     2.8258e-6
  25 │ y8      ↔               y8      3.29809    0.721394     4.57184     4.83468e-6
  26 │ ind60   ↔               ind60   0.454495   0.0885065    5.13516     2.81908e-7
  27 │ dem60   ↔               dem60   4.00959    0.963841     4.16002     3.18226e-5
  28 │ dem65   ↔               dem65   0.174816   0.224789     0.777688    0.436753
  29 │ y1      ↔               y5      0.632063   0.376484     1.67886     0.0931798
  30 │ y2      ↔               y4      1.33093    0.713031     1.86658     0.0619609
  31 │ y2      ↔               y6      2.18217    0.741138     2.94435     0.00323632
  32 │ y3      ↔               y7      0.805616   0.633766     1.27116     0.203673
  33 │ y8      ↔               y4      0.352961   0.467182     0.755512    0.449942
  34 │ y8      ↔               y6      1.37453    0.583618     2.35519     0.0185131

Code

using StructuralEquationModels

observed_vars = [:x1, :x2, :x3, :y1, :y2, :y3, :y4, :y5, :y6, :y7, :y8]
latent_vars = [:ind60, :dem60, :dem65]

graph = @StenoGraph begin

    # loadings
    ind60 → fixed(1)*x1 + x2 + x3
    dem60 → fixed(1)*y1 + y2 + y3 + y4
    dem65 → fixed(1)*y5 + y6 + y7 + y8

    # latent regressions
    ind60 → dem60
    dem60 → dem65
    ind60 → dem65

    # variances
    _(observed_vars) ↔ _(observed_vars)
    _(latent_vars) ↔ _(latent_vars)

    # covariances
    y1 ↔ y5
    y2 ↔ y4 + y6
    y3 ↔ y7
    y8 ↔ y4 + y6

end

partable = ParameterTable(
    latent_vars = latent_vars,
    observed_vars = observed_vars,
    graph = graph)

data = example_data("political_democracy")
model = Sem(
    specification = partable,
    data = data
)
model_fit = sem_fit(model)
fit_measures(model_fit)
sem_summary(model_fit)
update_estimate!(partable, model_fit)
update_partable!(partable, model_fit, se_hessian(model_fit), :se)
sem_summary(partable)

using DataFrames, Distributions

result = DataFrame(partable)[:, [:from, :parameter_type, :to, :estimate, :se]]
result[!, :z] = abs.(result.estimate ./ result.se)
result[!, :p] = 2(1 .- cdf.(Normal(), abs.(result.z)))

result

Enviroment

Maximilian-Stefan-Ernst commented 2 months ago

Hi! Thanks for the feedback. Yes, StructuralEquationModels.p_value is for the whole model. I think it would be a nice idea to add this to the documentation. Until I find time to do that, I would leave your issue open as a feature request.