CecileProust-Lima / lcmm

R package lcmm
https://CecileProust-Lima.github.io/lcmm/

Removing non-significant higher-order terms, and adding covariates #276


e-littler commented 2 weeks ago

Hello,

I am conducting some latent class growth models on hair cortisol concentration (HCC) levels in children using the hlme command:

lcga2_q_cond <- hlme(cHairCortisolConclog ~ Time+I(Time^2)+sex+age, subject = "ID", ng = 3, data = data_long, mixture = ~ Time+I(Time^2))

This is my output from this model: [image not included]

The quadratic Time term is not significant in Class 2. I have seen other papers remove non-significant higher-order terms from only the classes where they were not significant, and I would like to do the same (i.e., remove the quadratic Time term from Class 2 only). I know this is possible in SAS using PROC TRAJ and the ORDER= statement, but is it possible with hlme or elsewhere in the lcmm package?

Additionally, I have some more theoretical questions about including covariates in latent class growth models, and would greatly appreciate any clarification:

HCC levels are affected by sex (higher in males) and age (higher in younger children), and I would like to control for these influences in the model and in the creation of the classes. Am I correct in my approach of including these covariates as fixed effects in the model as seen above? If so, would it be appropriate to conduct a multinomial logistic regression after assigning individuals to classes, to explore whether sex and age are significant predictors of group membership? Or would that not be appropriate because the classes were created controlling for these variables already?

I know it is also possible to include covariates in the mixture statement as class-specific fixed effects, but I'm having trouble understanding if that would be more appropriate or why you would want to include covariates as class-specific fixed effects?

I'm also having trouble understanding the difference between including covariates as fixed effects in the model, or including them using the classmb statement (since both approaches adjust membership in the classes as a result of including covariates). For my purposes, would including age and sex using classmb be more appropriate if I want to control for their influence? Any clarification would be greatly appreciated.

Thank You!

VivianePhilipps commented 1 week ago

Hi,

If you want to remove a term for only one latent class, you have to supply a vector of initial values (argument B) with a 0 for the parameter corresponding to this term, and use the posfix argument to specify that this parameter should not be estimated. In your example the term you want to fix is the 8th parameter, so you would use

m <- hlme(..., B = c(4.02, 2.41, 3.84, -0.69, 0.12, -1.46, 0.16, 0, 0.28, -0.25, -0.10, 0.87), posfix = c(8))
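Counting positions by hand is error-prone, so one option is to look the parameter up by name. The sketch below uses base R only. `fix_at_zero` is a hypothetical helper, the values are the 14-element B vector posted later in this thread, and the labels are an assumed reading of how hlme orders its parameters (class-membership intercepts first, then fixed effects, then the residual standard error). Under that assumed ordering the quadratic term for class 2 sits at position 10 rather than 8, because the two class-membership intercepts are counted first. On a real fit the labels would come from names(m$best), so check them before relying on either number.

```r
# Values copied from the 14-element B vector posted later in this thread.
# The labels are an ASSUMED reading of the parameter order; verify them
# against names(m$best) on a fitted model.
labels <- c(
  "membership class1", "membership class2",
  paste("intercept", paste0("class", 1:3)),
  paste("Time", paste0("class", 1:3)),
  paste("I(Time^2)", paste0("class", 1:3)),
  "pcsex", "cageForAnalysis", "stderr"
)
values <- c(0.85, -0.78, 4.02, 2.41, 3.84, -0.69, 0.12, -1.46,
            0.16, 0, 0.28, -0.25, -0.10, 0.87)
est <- setNames(values, labels)

# Hypothetical helper: set one named parameter to 0 and report its position,
# ready to be passed to hlme as B and posfix.
fix_at_zero <- function(est, label) {
  idx <- which(names(est) == label)
  stopifnot(length(idx) == 1)   # fail loudly if the label is missing or ambiguous
  est[idx] <- 0
  list(B = unname(est), posfix = idx)
}

start <- fix_at_zero(est, "I(Time^2) class2")
start$posfix      # position to pass to posfix
length(start$B)   # must equal the number of parameters in the model

# With lcmm loaded and the real data available, the refit would then be:
# m <- hlme(..., ng = 3, B = start$B, posfix = start$posfix)
```

If the length of B does not match the number of parameters in the model, that is usually a sign that some block of parameters (for example the class-membership part) was left out of the vector.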

Concerning the covariates, if you include sex and age in classmb, you will influence the model to find latent classes that differ according to these covariates. The fixed effects will then be the residual effects of these covariates. If you put them in mixture, you will be even more precise since you will get an effect for each class.
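To make the three placements concrete, here is a minimal sketch on simulated data. The data set, the variable names y, sex and age, and all numeric values are hypothetical stand-ins for the thread's data; only the hlme arguments (fixed formula, mixture, classmb, B) are the ones discussed above. Starting values are taken from a one-class model via B = m1, which is one common choice and not the only one.

```r
library(lcmm)
set.seed(123)

# Simulated stand-in for data_long; every value here is hypothetical.
n <- 150
subj <- data.frame(ID = 1:n, sex = rbinom(n, 1, 0.5), age = rnorm(n),
                   grp = sample(1:3, n, replace = TRUE))
data_long <- merge(subj, data.frame(Time = 0:3))  # one row per child and visit
data_long <- data_long[order(data_long$ID, data_long$Time), ]
data_long$y <- with(data_long,
  c(4, 2.5, 3.8)[grp] + c(-0.7, 0.1, -1.5)[grp] * Time +
  c(0.15, 0, 0.3)[grp] * Time^2 - 0.25 * sex - 0.10 * age +
  rnorm(length(Time), sd = 0.5))

# One-class model, used only to initialise the 3-class fits.
m1 <- hlme(y ~ Time + I(Time^2) + sex + age,
           subject = "ID", ng = 1, data = data_long)

# (a) sex and age as common fixed effects: one coefficient each, shared by all classes.
m_fixed <- hlme(y ~ Time + I(Time^2) + sex + age,
                mixture = ~ Time + I(Time^2),
                subject = "ID", ng = 3, data = data_long, B = m1)

# (b) sex and age also in mixture: one coefficient per class for each covariate.
m_mixture <- hlme(y ~ Time + I(Time^2) + sex + age,
                  mixture = ~ Time + I(Time^2) + sex + age,
                  subject = "ID", ng = 3, data = data_long, B = m1)

# (c) sex and age in classmb: they predict class membership, and the fixed
#     effects that remain in the outcome model are their residual effects.
m_classmb <- hlme(y ~ Time + I(Time^2) + sex + age,
                  mixture = ~ Time + I(Time^2), classmb = ~ sex + age,
                  subject = "ID", ng = 3, data = data_long, B = m1)

# Number of estimated parameters under each specification.
sapply(list(fixed = m_fixed, mixture = m_mixture, classmb = m_classmb),
       function(m) length(m$best))
```

Comparing summary() of the three fits shows where the covariate coefficients appear: in the outcome model only for (a) and (b), and in both the class-membership model and the outcome model for (c).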

Viviane

e-littler commented 4 days ago

Hello Viviane,

Thank you so much for your response - this is very helpful.

I just have a follow-up question: how would I implement this solution when using the gridsearch function? Or is it even possible to do so?

I tried to implement this solution with the following code, but the results don't make sense. The class membership changes dramatically, and the coefficient for Time^2 for class 2 is not zero (even though I am fixing it to be 0 in the code).

lcga3_q_cond_r <- gridsearch(rep = 100, maxiter = 10, minit = lcga1_q_cond, m=hlme(cHairCortisolConclog ~ Time+I(Time^2)+pcsex+cageForAnalysis, subject = "ID", ng = 3, data = data_long, mixture = ~ Time+I(Time^2), B = c(0.85, -0.78, 4.02, 2.41, 3.84, -0.69, 0.12, -1.46, 0.16, 0, 0.28, -0.25, -0.10, 0.87), posfix = c(10)))

Thank you!

VivianePhilipps commented 2 days ago

The gridsearch function won't use the initial values specified in B; instead it randomly generates the initial values from the one-class model estimates. That's why the coefficient for Time^2 in class 2 is not 0. Fixing this parameter within gridsearch is not a good idea, because you don't know at which value it will be fixed (the value is random). But if the model returned by gridsearch has a very small coefficient and you want to fix it at 0, you can then run the model again (hlme without gridsearch) with B and posfix specified as in my previous answer.
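The two-step approach described above could look roughly like the sketch below. It runs on simulated data (all values hypothetical) that reuses the thread's variable names. The lookup by name assumes that the estimates in $best are labelled like "I(Time^2) class2", which should be checked with names() on your own fit. Class labels can also come out in a different order after a grid search, so inspect the summary first and pick whichever class actually has the near-zero quadratic term.

```r
library(lcmm)
set.seed(1)

# Simulated stand-in for the thread's data_long; every value is hypothetical.
n <- 150
subj <- data.frame(ID = 1:n, pcsex = rbinom(n, 1, 0.5),
                   cageForAnalysis = rnorm(n),
                   grp = sample(1:3, n, replace = TRUE))
data_long <- merge(subj, data.frame(Time = 0:3))
data_long <- data_long[order(data_long$ID, data_long$Time), ]
data_long$cHairCortisolConclog <- with(data_long,
  c(4, 2.5, 3.8)[grp] + c(-0.7, 0.1, -1.5)[grp] * Time +
  c(0.15, 0, 0.3)[grp] * Time^2 - 0.25 * pcsex - 0.10 * cageForAnalysis +
  rnorm(length(Time), sd = 0.5))

# Step 1: one-class model, from which gridsearch draws random starting values.
lcga1_q_cond <- hlme(cHairCortisolConclog ~ Time + I(Time^2) + pcsex + cageForAnalysis,
                     subject = "ID", ng = 1, data = data_long)

# Step 2: grid search WITHOUT B or posfix (the thread uses rep = 100;
# a smaller value is used here only to keep the sketch quick).
lcga3_q_cond <- gridsearch(rep = 30, maxiter = 10, minit = lcga1_q_cond,
  m = hlme(cHairCortisolConclog ~ Time + I(Time^2) + pcsex + cageForAnalysis,
           mixture = ~ Time + I(Time^2),
           subject = "ID", ng = 3, data = data_long))
summary(lcga3_q_cond)   # check which class has the small Time^2 coefficient

# Step 3: start from the grid-search estimates, set the chosen coefficient to 0,
# and refit with plain hlme, keeping that parameter fixed via posfix.
B0  <- lcga3_q_cond$best
idx <- grep("I(Time^2) class2", names(B0), fixed = TRUE)  # assumed label
stopifnot(length(idx) == 1)
B0[idx] <- 0

lcga3_q_cond_r <- hlme(cHairCortisolConclog ~ Time + I(Time^2) + pcsex + cageForAnalysis,
                       mixture = ~ Time + I(Time^2),
                       subject = "ID", ng = 3, data = data_long,
                       B = as.numeric(B0), posfix = idx)
summary(lcga3_q_cond_r)
```

Because the refit starts from the grid-search solution, class membership should change much less than when arbitrary starting values are supplied together with posfix.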

Viviane