[Bug]: computation of standardized estimates (mediation analysis)

PedroJoelRosa commented 5 months ago

JASP Version

0.18.3

Commit ID

No response

JASP Module

SEM

What analysis are you seeing the problem on?

mediation analysis

What OS are you seeing the problem on?

Windows 10

Bug Description

Hi everyone. I am using JASP version 0.18.3 and I am performing a mediation analysis via the SEM menu.

I think I found a bug concerning the computation of standardized estimates.

I am preparing some slides for my students on the simple mediation model (manifest variables). First they have to learn the Baron and Kenny (1986) four steps, then run three linear regressions (the OLS step-by-step approach), and then use the SEM one-step approach.

I did the same analysis via mediation analysis (in the SEM menu). The unstandardized estimates (b) are the same, but the standardized estimates (beta) are not, which was really weird.

So I transformed the original data values of my variables into z-scores and ran the analysis again via SEM, so that the unstandardized estimates are in fact standardized estimates, and I obtained the same values as in the OLS regressions.
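The equivalence behind this z-score check can be illustrated with a minimal Python sketch (plain OLS on simulated data, not JASP or lavaan code; `ols_slope`, `zscore`, and the variable names are illustrative assumptions). It shows that rescaling an unstandardized slope by sd(x)/sd(m) gives the same number as refitting on z-scored variables:

```python
import random
import statistics as st

def ols_slope(x, y):
    """Slope from the simple OLS regression of y on x."""
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def zscore(v):
    """Centre and scale to unit sample SD (n - 1 denominator)."""
    mean, sd = st.fmean(v), st.stdev(v)
    return [(a - mean) / sd for a in v]

# Simulated data; names and effect sizes are made up for illustration.
rng = random.Random(1)
x = [rng.gauss(50, 10) for _ in range(200)]
m = [0.4 * a + rng.gauss(0, 5) for a in x]

b = ols_slope(x, m)                             # unstandardized slope
beta_rescaled = b * st.stdev(x) / st.stdev(m)   # standardize after fitting
beta_zscored = ols_slope(zscore(x), zscore(m))  # fit on z-scored data

print(beta_rescaled, beta_zscored)
```

Both printed values agree up to floating-point error, which is why the z-scored SEM run reproduces the OLS betas.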

My question is, what really happens in terms of computation when someone selects "standardized estimates"?

Of course the Z-score transformation is a simple workaround but the "standardized estimates" option should be working properly.

Many thanks in advance.

Expected Behaviour

In mediation analysis, the standardized estimates obtained from the original data (with the "standardized estimates" option selected) should be the same as the unstandardized estimates obtained from the z-scored data.

That is not the case.


Steps to Reproduce

1. 2. 3. ...

Log (if any)

jaspSem::MediationAnalysis( version = "0.18.3", bootstrapSamples = 2000, mediators = "auto_est", outcomes = "felic", pathPlot = TRUE, pathPlotLegend = TRUE, pathPlotParameter = TRUE, predictors = "notas_MIPCS2", rSquared = TRUE, residualCovariance = FALSE, standardizedEstimate = TRUE, syntax = TRUE, totalIndirectEffect = FALSE)

Final Checklist

[ ] I have included a screenshot showcasing the issue, if possible.
[ ] I have included a JASP file (zipped) or data file that causes the crash/bug, if applicable.
[X] I have accurately described the bug, and steps to reproduce it.

github-actions[bot] commented 5 months ago

@PedroJoelRosa, thanks for taking the time to create this issue. If possible (and applicable), please upload to the issue website (https://github.com/jasp-stats/jasp-issues/issues/2740) a screenshot showcasing the problem, and/or a compressed (zipped) .jasp file or the data file that causes the issue. If you would prefer not to make your data publicly available, you can send your file(s) directly to us, issues@jasp-stats.org

juliuspfadt commented 5 months ago

Dear @PedroJoelRosa, thanks for creating the issue. That's a good point. I looked into the function and into what lavaan does: the behavior we took over from lavaan is to standardize only the endogenous variables, instead of all the observed variables. This would be a simple fix if we didn't also offer bootstrapping, which makes the standardized CIs a bit more complicated to obtain. Ideally, lavaan would change its behavior so that all observed variables are actually standardized. Otherwise, this will take me some time.

sfcheung commented 5 months ago

I have rarely used the SEM module in JASP. If I misunderstood anything about it, please correct me.

Based on the discussion and my experiment, it seems that the standardized estimates in the mediation option of the SEM module are obtained by the option std.ov in lavaan(). This option is new to me (but see my comment below). The original intended behavior (https://github.com/jasp-stats/jasp-issues/issues/2740#issue-2317857776) seems to be having all observed variables standardized before fitting the model. Without waiting for any changes in lavaan, this can be done by adding fixed.x = FALSE. Then std.ov should standardize all observed variables.

The reason, to my understanding, is the default value of fixed.x, which is TRUE. With this setting, exogenous observed variables are treated as fixed and so are not standardized even when std.ov is set to TRUE.
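To see why leaving the exogenous predictor unscaled changes the reported coefficient, here is a minimal Python sketch (plain OLS on simulated data, not lavaan; it assumes, as described in this thread, that only the endogenous variable is rescaled, and all names are illustrative):

```python
import random
import statistics as st

def ols_slope(x, y):
    """Slope from the simple OLS regression of y on x."""
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Simulated data; names and effect sizes are made up for illustration.
rng = random.Random(2)
x = [rng.gauss(50, 10) for _ in range(200)]  # exogenous predictor
m = [0.4 * a + rng.gauss(0, 5) for a in x]   # endogenous mediator

b = ols_slope(x, m)
sd_x, sd_m = st.stdev(x), st.stdev(m)

beta_full = b * sd_x / sd_m  # both variables standardized
beta_endo_only = b / sd_m    # only the endogenous variable standardized

print(beta_full, beta_endo_only)
```

The two values differ by exactly the factor sd(x), so a path from an unscaled predictor will not match the OLS beta unless that predictor happens to have unit SD.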

Although the intended behavior can be achieved in lavaan using this workaround for now, there are two issues. First, whether fixed.x is TRUE or FALSE affects how bootstrapping is conducted. Changing this option changes the model.

Second, it has been argued that standardizing the variables first and then doing the analysis is not an appropriate way to get the standardized indirect effect in mediation analysis. This method is called the naive bootstrap CI in Cheung (2009):

Cheung, M. W.-L. (2009). Comparison of methods for constructing confidence intervals of standardized indirect effects. Behavior Research Methods, 41(2), 425–438. https://doi.org/10.3758/BRM.41.2.425

On p. 431, "This means that using a larger sample size does not reduce the bias introduced in the naive bootstrap CI. Given that the naive bootstrap CI is not theoretically justified, it is not advisable to use it."

On p. 427, "That is, variability in estimating V_X and V_Y is not incorporated into the calculations of the CI of the standardized indirect effect."

In the OLS approach by PROCESS, standardization is also done again for each bootstrap sample. It does not standardize the variables before doing bootstrapping.
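The difference between the two bootstrap strategies can be sketched in a few lines of Python (a simple-regression illustration on simulated data, not PROCESS or lavaan code; all names are assumptions). Standardizing once before resampling is algebraically the same as multiplying every bootstrap slope by the full-sample SD ratio, so the sampling variability of the SDs never enters the bootstrap distribution:

```python
import random
import statistics as st

def ols_slope(x, y):
    """Slope from the simple OLS regression of y on x."""
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Simulated data; names and effect sizes are made up for illustration.
rng = random.Random(3)
n = 100
x = [rng.gauss(0, 2) for _ in range(n)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]
full_ratio = st.stdev(x) / st.stdev(y)  # SD ratio frozen at full-sample values

naive, restandardized = [], []
for _ in range(500):
    idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
    xb = [x[i] for i in idx]
    yb = [y[i] for i in idx]
    b = ols_slope(xb, yb)
    # Naive: equivalent to z-scoring the data once, then bootstrapping.
    naive.append(b * full_ratio)
    # Re-standardized: SDs are re-estimated in every bootstrap sample.
    restandardized.append(b * st.stdev(xb) / st.stdev(yb))

print(st.stdev(naive), st.stdev(restandardized))
```

The two lists hold different bootstrap distributions for the same resamples, so intervals built from them will generally not coincide.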

My two cents.

juliuspfadt commented 5 months ago

Ah thanks, that is a good point. Essentially, JASP runs lavaan, so there is no difference really. We should implement actual post estimation standardisation, instead of pre-estimation.

PedroJoelRosa commented 5 months ago

Thanks for the explanation, @sfcheung. That was very clear!

juliuspfadt commented 2 months ago

@sfcheung, I am currently working on getting bootstrapped CIs for standardized estimates in SEM models, which includes mediation. Maybe you can help me.

From what you wrote before it seems to me a fine approach would be to fit the SEM model for a bootstrapped data sample, save the standardized estimates, repeat many times, and eventually create the CIs (in whatever manner, normal, percentile) from the bootstrapped sample of standardized estimates?

sfcheung commented 2 months ago

@juliuspfadt , sorry for my late reply. The semester has just started here.

Yes, you are right. If you want to get the correct bootstrapped CIs for the standardized estimates, and for the standardized indirect effect computed from these estimates, you need to compute and save the standardized estimates generated by lavaan in each bootstrap sample. This can be done by extracting the column std.all from lavaan::parameterEstimates(). lavaan::standardizedSolution() won't work because the CIs it reports are delta-method CIs, even if se = "bootstrap" (it uses the bootstrap VCOV to form delta-method CIs).
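The procedure can be sketched in Python as follows (a minimal illustration that uses two OLS fits as a stand-in for the lavaan model; `std_indirect` and `percentile_ci` are hypothetical helpers, not lavaan or JASP functions, and the data are simulated). The standardized indirect effect is recomputed in every bootstrap sample, saved, and then summarized with percentiles:

```python
import random
import statistics as st

def cov(u, v):
    """Sample covariance with an n - 1 denominator."""
    mu, mv = st.fmean(u), st.fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def std_indirect(x, m, y):
    """Standardized indirect effect a * b * sd(x) / sd(y) from two OLS fits."""
    sxx, smm, syy = cov(x, x), cov(m, m), cov(y, y)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    a = sxm / sxx                                         # m ~ x
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)  # y ~ m + x, coefficient of m
    return a * b * (sxx / syy) ** 0.5

def percentile_ci(x, m, y, n_boot=1000, level=0.95, seed=0):
    """Percentile CI from standardized estimates saved in each bootstrap sample."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        draws.append(std_indirect([x[i] for i in idx],
                                  [m[i] for i in idx],
                                  [y[i] for i in idx]))
    draws.sort()
    alpha = 1 - level
    # Simple order-statistic rule for the percentiles.
    lower = draws[int((alpha / 2) * (n_boot - 1))]
    upper = draws[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper

# Simulated data; names and effect sizes are made up for illustration.
rng = random.Random(4)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.5 * a + rng.gauss(0, 1) for a in x]
y = [0.4 * mi + 0.2 * xi + rng.gauss(0, 1) for mi, xi in zip(m, x)]

estimate = std_indirect(x, m, y)
lower, upper = percentile_ci(x, m, y)
print(estimate, lower, upper)
```

Because the SDs are re-estimated inside `std_indirect` for every resample, their sampling variability is carried into the interval, which is the point of saving the standardized estimates per bootstrap sample.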

Does the mediation module in JASP do the resampling itself? Or does it let lavaan do the resampling, e.g., by setting se = "bootstrap"? If it is done by lavaan, then the argument fixed.x will affect how the resampling is done. By default, fixed.x = TRUE, and the exogenous observed variables are treated as fixed, so their variances, covariances, and SDs will not change across bootstrap samples. If fixed.x = FALSE, then the SDs of exogenous observed variables also vary across bootstrap samples. The bootstrap CIs with fixed.x = FALSE and those with fixed.x = TRUE may be different.

Hope this helps.

juliuspfadt commented 2 months ago

No worries, I appreciate your input. Regarding the mediation: yes, I have changed the behaviour of fixed.x for when bootstrapped standard errors are requested. I have also changed how standardized bootstrap CIs are obtained for SEM, so that the standardized estimates are saved at each bootstrap iteration and then summarised to form the CIs.
