amices / mice

Multivariate Imputation by Chained Equations
https://amices.org/mice/
GNU General Public License v2.0

mice multilevel imputation for a longitudinal dataset #536

Closed tanzzzz closed 1 year ago

tanzzzz commented 1 year ago

Dear reader,

I am currently working with a dataset of 2100 subjects measured at 3 time points. They answered a questionnaire about food intake, and I have missing values in both categorical and continuous independent variables. Each subject has a unique id and, for each time point, anthropometric and background measurements (age, education, etc.). In addition, MRI images of their brains were collected at two of the three time points, and my hypothesis concerns the effect of diet on brain tissue volume (the dependent variable).

After reading the available manuals, my understanding is as follows: the id is the cluster variable, the questionnaire items and anthropometric measurements are level-1 independent variables, the brain volume measurements are the level-1 dependent variable, and sex is a level-2 independent variable.

I edited the predictor matrix to exclude variables with more than 50% missing values, and to avoid using variables with collinearity above 0.9 as predictors of each other. I then decided to change the methods as well. I read that I can use 2l.norm, 2l.lmer or 2l.pan for numeric level-1 variables, i.e. the continuous food questionnaire and anthropometric measurements. There is also the 2l.bin method for binary variables, but what about categorical variables? Moreover, in the predictor matrix, should I use the code 2 only for level-2 variables or for all variables? I also read that mice can be used with the wide format of a longitudinal dataset, and I was wondering which format is preferable.

I hope I have explained my problem clearly. If more information is needed, please let me know. Your help is highly appreciated; thank you very much.
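For readers following the predictor matrix question, here is a minimal sketch of how the codes are usually laid out for the 2l.* methods. It assumes the convention described in the mice documentation for 2l.norm and 2l.pan: -2 marks the cluster variable, 1 a predictor with a fixed effect, and 2 a predictor with a fixed and a random effect. Under that reading, the code 2 is about random effects rather than about level-2 variables. The variable names (`id`, `wave`, `sex`, `intake`, `volume`) are hypothetical stand-ins for the data described above, and the matrix is built in base R to mimic the default layout (ones off the diagonal, zeros on it).

```r
# Hypothetical variable names standing in for the poster's data
vars <- c("id", "wave", "sex", "intake", "volume")

# Default-style predictor matrix: every variable predicts every other one
pred <- matrix(1, nrow = length(vars), ncol = length(vars),
               dimnames = list(vars, vars))
diag(pred) <- 0

# Rows are the variables being imputed, columns are their predictors
level1_targets <- c("intake", "volume")
pred[level1_targets, "id"]   <- -2  # cluster (class) variable
pred[level1_targets, "wave"] <- 2   # fixed + random effect (random slope)
pred[level1_targets, "sex"]  <- 1   # fixed effect only

# Method vector: 2l.pan for the numeric level-1 variables, "" for the rest
meth <- setNames(rep("", length(vars)), vars)
meth[level1_targets] <- "2l.pan"

print(pred)
print(meth)
```

The matrix and method vector would then be passed to `mice()` through its `predictorMatrix` and `method` arguments. Whether a random slope for `wave` is sensible depends on the data, so treat that row as an illustration only.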

tanzzzz commented 1 year ago

I have also been trying to use mice on a cross-sectional dataset. After changing the predictor matrix and running mice, I got this error: Error in ranger::ranger(x = xobs, y = yobs, num.trees = ntree) : Error: No covariates found. Could anyone explain this to me? I think I may have made a mistake while editing the matrix and deleting some variables.

SavvyDev84 commented 1 year ago

Hi @tanzzzz

Error if no covariates

if (length(all.independent.variable.names) < 1) { stop("Error: No covariates found.") }

I think that either there are no independent variables or they are not being taken into account. It would help if you showed your code.

Without seeing it, my only guess is that you are using an argument that requires another argument, e.g. mice(arg).
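One way to look for the cause suggested above is to check whether manual edits have left some incomplete variable with no predictors at all. The sketch below assumes that the error arises when a row of the predictor matrix is entirely zero for a variable that still needs imputation; this is a plausible cause, not a confirmed one. The variable names and the `has_missing` vector are hypothetical; with real data, `has_missing` could be computed as `colSums(is.na(data)) > 0`.

```r
# Hypothetical variables and a default-style predictor matrix
vars <- c("age", "bmi", "intake", "smoker")
pred <- matrix(1, nrow = 4, ncol = 4, dimnames = list(vars, vars))
diag(pred) <- 0

# Mimic over-eager manual pruning: all predictors of 'intake' removed
pred["intake", ] <- 0

# Which variables have missing values (assumed here for illustration)
has_missing <- c(age = FALSE, bmi = TRUE, intake = TRUE, smoker = FALSE)

# Count the predictors left in each row
n_pred <- rowSums(pred != 0)

# Incomplete variables that are left without any covariate
no_covariates <- names(n_pred)[n_pred == 0 & has_missing[names(n_pred)]]
print(no_covariates)  # "intake"
```

Any variable listed here needs at least one predictor restored in its row, or a method that does not rely on covariates.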

stefvanbuuren commented 1 year ago

If you have a dataset with three waves, then it is generally easier and more flexible to impute the wide format. See http://stefvanbuuren.name/fimd/sec-longandwide.html for an example (SE Fireworks).
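As a rough illustration of the wide-format route, the sketch below reshapes a small simulated long dataset so that each wave becomes its own column, then imputes the wide data with the default mice settings. The data and the column names (`id`, `wave`, `sex`, `intake`) are made up for the example, and the imputation step only runs if the mice package is installed.

```r
set.seed(1)
n <- 100  # hypothetical number of subjects

# Simulated long data: one row per subject and wave
long <- data.frame(
  id     = rep(seq_len(n), each = 3),
  wave   = rep(1:3, times = n),
  sex    = rep(rep(c(0, 1), length.out = n), each = 3),  # constant within subject
  intake = rnorm(3 * n)
)
long$intake[c(2, 9, 16, 40, 77)] <- NA  # a few missing measurements

# Long -> wide: one row per subject, one 'intake' column per wave
wide <- reshape(long, idvar = "id", timevar = "wave",
                v.names = "intake", direction = "wide")
print(names(wide))

# Impute the wide data, leaving the id column out of the imputation model
if (requireNamespace("mice", quietly = TRUE)) {
  imp <- mice::mice(wide[, names(wide) != "id"], m = 5, seed = 1,
                    printFlag = FALSE)
  completed <- mice::complete(imp, 1)
}
```

In the wide layout, each wave of a variable can serve as a predictor for the other waves, so no multilevel method is needed. After imputation, the completed data can be reshaped back to long format for the analysis model.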