falkcarl / multilevelmediation

Missing values #29

Closed nvanpo closed 9 months ago

nvanpo commented 1 year ago

I'm new to multilevelmediation and to GitHub in general.

I tried to fit the model with the R code, but it failed at the very first step (modmed.mlm) because my data has missing values.

Thank you for your help

falkcarl commented 1 year ago

The package is not yet designed to handle data with missing values. Depending on the pattern and amount of missing data, there may be some interim solutions. Which variable(s) have missing data and about how much is missing?

falkcarl commented 1 year ago

Also tagged as possible enhancement as this is not really a bug.

nvanpo commented 1 year ago

Thanks for your response! It's a longitudinal design with 55 days for each participant. Of the 2090 rows in the data frame that contain the variables of interest, 329 appear to have one or more missing values.

Is there another solution for this?

falkcarl commented 1 year ago

That seems like it might be a fair amount of missing data. I think additional clarification may help me understand better. Suppose you use stack_bpg() on the dataset. What are the dimensions of the resulting data frame and how many rows of that dataset have missing values? Are you typically missing data for the predictor, mediator, outcome all at the same time? Or is it a subset of these?
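
The questions above can be answered with a few lines of base R. What follows is a minimal sketch on a made-up data frame: the column names `id`, `X`, `M`, and `Y` and all values are assumptions for illustration, and the code inspects an ordinary long-format data frame rather than the actual output of stack_bpg().

```r
# Toy long-format data (made up for illustration): 2 people, 3 days each.
dat <- data.frame(
  id = rep(1:2, each = 3),
  X  = c(0.5, NA, 1.2, 0.3, 0.8, NA),
  M  = c(1.0, NA, 0.7, NA, 0.9, 1.1),
  Y  = c(2.1, 1.9, 2.5, 1.7, 2.0, 2.2)
)

# Hypothetical names of the predictor, mediator, and outcome columns.
vars <- c("X", "M", "Y")

# Dimensions of the data frame (rows, columns).
dims <- dim(dat)

# Number of missing values per variable.
n_missing <- colSums(is.na(dat[vars]))

# Number of rows with at least one missing value.
n_incomplete <- sum(!complete.cases(dat[vars]))

# Missingness patterns: which variables are missing together?
# Each pattern is a string ordered X, M, Y, where "1" marks a missing value.
pattern <- apply(is.na(dat[vars]), 1,
                 function(r) paste(as.integer(r), collapse = ""))
pattern_counts <- table(pattern)

print(dims)
print(n_missing)
print(n_incomplete)
print(pattern_counts)
```

A pattern such as "110" occurring often would indicate that the predictor and mediator tend to be missing on the same days while the outcome is observed.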

Ad-hoc ways of dealing with missing data could include removing rows that have missing data from the long version of the dataset. If that were done, however, it would probably be slightly better if the code in multilevelmediation would automatically do something like that internally after restructuring the data.
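
As a rough illustration of the ad-hoc approach just described, rows with missing values can be dropped with base R before the data are passed on. This is only a sketch under stated assumptions: the data frame, its column names (`id`, `X`, `M`, `Y`), and its values are invented, and nothing here reflects what multilevelmediation does internally.

```r
# Toy long-format data (made up for illustration).
dat <- data.frame(
  id = rep(1:2, each = 3),
  X  = c(0.5, NA, 1.2, 0.3, 0.8, NA),
  M  = c(1.0, NA, 0.7, NA, 0.9, 1.1),
  Y  = c(2.1, 1.9, 2.5, 1.7, 2.0, 2.2)
)

# Only variables that enter the model should decide whether a row is dropped.
model_vars <- c("X", "M", "Y")
keep <- complete.cases(dat[model_vars])

# Listwise deletion: keep rows that are complete on all model variables.
dat_complete <- dat[keep, ]

# Report how much was lost, overall and per person.
n_dropped   <- sum(!keep)
rows_per_id <- table(dat_complete$id)

print(n_dropped)
print(rows_per_id)
```

Checking the rows retained per person is worthwhile because a participant with very few complete days contributes little information to random slopes.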

Although I don't recall lavaan supporting the types of multilevel models we wanted to estimate when we started the project, it may now be an alternative approach that can handle missing data (e.g., https://francish.net/post/accounting-for-missing-data-mlm/). I would need more time to test it out (e.g., I don't immediately see how heteroscedasticity can be modeled). The plan would be to restructure the data with stack_bpg(), following the Bauer, Preacher, and Gil (2006) approach, and then see whether lavaan can directly estimate the desired multilevel mediation model. If so, the next step would be to determine how to obtain a CI for the indirect effect.

falkcarl commented 1 year ago

Perhaps even better, you could use multiple imputation with brms: https://cran.r-project.org/web/packages/brms/vignettes/brms_missings.html

Note that I have not yet looked into combining brms, multilevel data, and multiple imputation. But van Buuren has a great book with a section on multilevel multiple imputation: https://stefvanbuuren.name/fimd/

We studied use of brms for the related publication, but did not (yet) provide explicit support with the R package here. The supplementary materials should have some R code.

nvanpo commented 1 year ago

My initial aim was to fit a multilevel mediation model with random slopes, which lavaan does not support to date. The main issue is missing values in the mediator and predictor; the outcome is almost complete. To handle the missing values, I simply created a new data frame that omits the rows with NAs. With this it "worked".

Now I'm trying to fit a multilevel mediation model with z-standardized values and multiple covariates (for M and Y). In doing so, I get another error: [image]

So I'm at the point where I have fixed random.a, random.b, and random.cprime, which leaves me with no random effects at all (not even a random intercept)?

We discussed imputation as well, but unfortunately it might not be applicable.

falkcarl commented 1 year ago

I suggest emailing me and we can follow up with respect to the particular model here. It may be an issue with how you have done the centering rather than missing data at this point, and so that may make more sense elsewhere (email, or some separate thread). I do not (yet) have a place for posting common questions or FAQ.
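
For readers unsure what centering refers to here, the sketch below shows one common form, within-person (person-mean) centering, in base R. It is an illustration only: the data, the column names `id` and `X`, and the choice to ignore missing days when computing person means are all assumptions, and it is not necessarily the centering the poster applied or the one the package expects.

```r
# Toy long-format data (made up for illustration): 2 people, 3 days each.
dat <- data.frame(
  id = rep(1:2, each = 3),
  X  = c(1, 2, 3, 10, NA, 14)
)

# Person mean of X, computed within each id and ignoring missing days.
dat$X_pm <- ave(dat$X, dat$id, FUN = function(v) mean(v, na.rm = TRUE))

# Within-person centered X: each day's deviation from that person's own mean.
# Days with missing X stay missing.
dat$X_wc <- dat$X - dat$X_pm

print(dat)
```

After this step the centered variable averages zero within every person, so it carries only day-to-day variation, while the person mean carries the between-person differences.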

falkcarl commented 9 months ago

Some mention of how missing data may be handled is now available: falkcarl/multilevelmediation@aafd4a8c4ff71b31eb8f6d0e4e0ff89a95bef95e

See also the FAQ on GitHub (in the README; scroll down: https://github.com/falkcarl/multilevelmediation) or the docs for modmed.mlm.

I missed the part about no random effects. If there are no random effects, there is no need to use a multilevel model.
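
To make the last point concrete: with no random effects, the mediation model reduces to two ordinary regressions. The following is a minimal single-level sketch in base R on simulated data; the variable names, the simulated effect sizes, and the percentile bootstrap are assumptions chosen for illustration, not the package's method. Resampling rows treats every observation as independent, which is what dropping all random effects implies.

```r
set.seed(1)

# Simulated single-level data (made up for illustration).
n <- 200
X <- rnorm(n)
M <- 0.5 * X + rnorm(n)
Y <- 0.4 * M + 0.2 * X + rnorm(n)
dat <- data.frame(X, M, Y)

# Path a: X -> M. Paths b and c': M -> Y and X -> Y, each adjusting for the other.
fit_m <- lm(M ~ X, data = dat)
fit_y <- lm(Y ~ X + M, data = dat)

a      <- coef(fit_m)[["X"]]
b      <- coef(fit_y)[["M"]]
cprime <- coef(fit_y)[["X"]]

# Indirect effect as the product of the a and b paths.
indirect <- a * b

# Percentile bootstrap CI for the indirect effect (rows resampled independently).
boot_ab <- replicate(500, {
  d <- dat[sample(n, replace = TRUE), ]
  coef(lm(M ~ X, data = d))[["X"]] * coef(lm(Y ~ X + M, data = d))[["M"]]
})
ci <- quantile(boot_ab, c(0.025, 0.975))

print(c(a = a, b = b, cprime = cprime, indirect = indirect))
print(ci)
```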

Handling of within-person centering is a separate issue. I am therefore closing this issue, since missing data now has some solution.