jslefche / piecewiseSEM

Piecewise Structural Equation Modeling in R
152 stars 48 forks source link

SEMs with gams produce additonal d-sep claims #267

Open annalenahendel opened 1 year ago

annalenahendel commented 1 year ago

Recently, I have started to use your development branch to fit gam-SEMs. Here I have been running into the problem that variables specified within smooth-terms are not recognized/ matched to their previously specified term. This creates the follow up problem that addittonal d-sep claims are tested (that are actually directly related) so that the SEM doesn't fit the data.

library(mgcv)

library(devtools)

devtools::install_github("jslefche/piecewiseSEM@devel") # use the

developing version library(piecewiseSEM) #piecewiseSEM version 2.2.1.

simulate data

A <- rnorm(100,mean = 10, sd = 5) B <- rnorm(100,mean = 10, sd = 5) C <- round((0.25A + 0.25B + rnbinom(100,mu=10,size=2)),0) D <- 0.1A + 0.25B + 0.5C + rnorm(100) E <- round(0.5(C) + 0.5*(D) + rnbinom(100,mu=5,size=2),0) dat <- data.frame(A,B,C,D,E)

construct model components using gams

gam.C <- mgcv::gam(C ~ s(A,k=3) + s(B,k=3),data=dat, family=negbin(1.8)) gam.D <- mgcv::gam(D ~ s(A,k=3) + s(B,k=3) + s(C,k=3),data=dat) gam.E <- mgcv::gam(E ~ s(C,k=3) + s(D,k=3),data=dat, family=negbin(14.6))

Same model with glms

glm.C <- glm.nb(C ~ A + B,data=dat) glm.D <- glm(D ~ A + B + C,data=dat) glm.E <- glm.nb(E ~ C + D,data=dat)

SEM

SEM.gam <- psem(gam.C,gam.D,gam.E) summary(SEM.gam)# no support for the model: sig. d-sep claim between: C ~ s(D, k = 3)

SEM.glm <- psem(glm.C,glm.D,glm.E) summary(SEM.glm) # support for the model

seanhardison1 commented 1 year ago

Hi @annalenahendel, this is probably too little too late, but I think I was able to solve this issue (although I will caveat this by saying that I'm not a statistician...).

The extra independence claim can be manually removed prior to estimating the $\chi^2$ statistic so that the GAM basis set aligns with the GLM basis set.

newBasisSet <- basisSet(SEM.gam)[1:2]
LLchisq(modelList = SEM.gam,
        basis.set = newBasisSet)

> LLchisq(modelList = SEM.gam,
+         basis.set = newBasisSet)

  Chisq   df P.Value
1 0.758 2.01   0.687

> LLchisq(modelList = SEM.glm,
+         basis.set = basisSet(SEM.glm))

  Chisq df P.Value
1 0.898  2   0.638

This seems reasonable, but I will defer to advice from package authors

annalenahendel commented 1 year ago

Hi @seanhardison1,

Thank you, yes this is also how I am working around this issue now... with quite some variables it gets somewhat annyoing to sort through...