moshagen / semPower

R package providing a-priori, post-hoc, and compromise power analyses for structural equation models (SEM)
7 stars 3 forks source link

list does not work with semPower.powerPath #23

Closed emafin closed 10 months ago

emafin commented 10 months ago

I have tried to test multiple effects through semPower.powerPath and the list function, as per the manual, but it does not work. I get the following error: Error in x[nullWhich[1], nullWhich[2]] : invalid subscript type 'list', which I don't get with semPower.powerCFA (the latter runs smoothly with no issues).

Reproducible example:

# Load packages
library(semPower)

# Beta matrix
# Specify the total number of variables
n_var <- 9

# Specify the number of endogenous variables (i.e., M and Y)
n_end <- 2

# Specify the beta matrix
bet <- matrix(c(
  rep(rep(0.0, n_var), n_var - n_end), # All exogenous
  c(0.20,0.20,0.20,0.20,0.00,0.00,-0.40,0.00,0.00), # M
  c(0.20,0.20,0.20,0.20,0.20,0.15,-0.30,0.40,0.00)), # Y
  byrow = TRUE, ncol = n_var)

# Define lambda matrix (diag = 1) for a set of observed variables
lam <- diag(n_var)

# Power to detect the direct effect
power_path_c <- semPower.powerPath(
  # Define type of power analysis
  type = "a-priori", alpha = 0.05, power = 0.80,
  # Define hypothesis
  Beta = bet,
  nullEffect = "beta = 0",
  nullWhich = list(c(9, 7), c(9, 8)),
  # Define measurement model
  Lambda = lam,
  # Define simulations
  simulatedPower = TRUE,
  simOptions = list(
    nReplications = 10,
    minConvergenceRate = .90)
)

Thank you.

moshagen commented 10 months ago

powerPath does not support stacking of hypotheses, so you cannot define the effect that both coefficients specified in nullWhich are zero. If you need stacked hypotheses, consider using powerPath to generate the population covariance matrix alongside with powerLav to define the proper H0 model.

If you rather wanted power to detect inequality of the coefficients defined in nullWhich, use nullEffect = "betaX = betaZ".

emafin commented 10 months ago

Thank you for the swift reply.

emafin commented 9 months ago

Apologies as I understand the issue was closed. I've tried to run semPower.powerLav but there is something I cannot understand. Let's say I have a simple mediation model X -> M -> Y. I want to determine sample size based on four null hypotheses, i.e.:

  1. Y ~ M ≠ 0
  2. Y ~ X ≠ 0
  3. M ~ X ≠ 0
  4. Indirect effect of X on Y through M ≠ 0

The following is the hypothetical true population model:

mod_pop <- "
X ~~ 1*X   
M ~~ 1*M
Y ~~ 1*Y

# Regressions
Y ~ 0.2*X + 0.2*M
M ~ 0.2*X

# Indirect
indirect := 0.2*0.2
"

I guess the H0 model would be:

mod_h0 <- "
X ~~ 1*X   
M ~~ 1*M
Y ~~ 1*Y

# Regressions
Y ~ 0*X + 0*M
M ~ 0*X

# Indirect
indirect := 0*0
"

And then I run semPower.powerLav:

power_analysis <- semPower.powerLav(type = "a-priori",
                                    alpha = 0.05,
                                    power = 0.80,
                                    modelPop = mod_pop,
                                    modelH0 = mod_h0)

But I get the error message:

Error in semPower.powerLav(type = "a-priori", alpha = 0.05, power = 0.8,  : 
  The H0 model did not converge.

The problem only resolves when I freely estimate at least one path within H0. Example:

mod_h0 <- "
X ~~ 1*X   
M ~~ 1*M
Y ~~ 1*Y

# Regressions
# Freely estimated effect of M on Y
Y ~ 0*X + M
M ~ 0*X

# Indirect
indirect := 0*0
"

I wonder how I may test the full set of hypotheses. Thank you for any advice you may be able to offer.

moshagen commented 9 months ago

The main problem is that your H0 model does not contain any free parameter, because your H0 model also fixes all the residual variances to 1, so there is nothing left to estimate for lavaan (the error message given by semPower is a bit misleading, because the issue is actually not the the model did not converge).

If you really want to place constraints on the residual variances as well, you could do a power analysis like this in your scenario:

Sigma <- fitted(lavaan::sem(mod_pop))[['cov']]
ap <- semPower.aPriori(Sigma = Sigma, SigmaHat = diag(3), df = 5, alpha = 0.05, power = 0.80)
summary(ap)

where SigmaHat is just an identity matrix, because you constrain all relations (covariances) to zero and all variances to one.

However, one typically only wants to constrain the paths, but not the residual variances, like this:

mod_h0 <- "
# Regressions
Y ~ 0*X + 0*M
M ~ 0*X
"

That aside, the the parameter reflecting the indirect effect can be omitted as well, because you just define a new parameter called "indirect" that has a value of zero, but is not a function of the regression coefficients of interest. In general, you need to define labels for the regression parameters and define the indirect effect as function of these labels. This following is equivalent to the model defined above:

# Regressions
Y ~ c*X + b*M
M ~ a*X

# Indirect
indirect := a*b
indirect == 0
a == 0
b == 0
c == 0 # direct effect 
"

Also note that your fourth null hypothesis (indirect effect of zero) is fully determined by hypothesis 1 and 3 and thus does not count as an independent hypothesis. I would recommend to reconsider whether you really expect both direct effects involving the mediator (and by implication the indirect effect) to equal zero, or whether you rather expect the indirect effect to equal zero, which only implies that one of the direct effects (but not necessarily both) is zero.

And finally, I am unsure whether you really want power for the entire set of four hypotheses. Usually one would want to test these hypotheses sequentially, as H1 - H3 are independent of each other. Then, power analyses are performed for each hypothesis separately, and one would inform sample size requirements based on the analyses yielding the largest required N.

emafin commented 9 months ago

Thank you for taking the time to respond to my query and for the detailed explanation. This is most helpful and clear, much appreciated.