hlme: nwg argument, interpreting model descriptors, calculating subject-specific intercept/slopes

swarupswaminathan commented 2 years ago

Hi Cecile,

I am using the lcmm package for the first time. Thank you for this fantastic package. My overall model is simple – I am studying a longitudinal testing outcome variable that changes over time based on disease progression. The patient population is heterogenous - there are non-progressors, slow progressors, and fast progressors. Hence, I hope to use the hlme call to categorize patients into latent classes. I am then interested in using the model to classify new patients appropriately (specifically, I am most interested in ensuring that fast progressors are classified correctly with as few tests as possible). I had a few questions regarding the package and its output.

1) In terms of the summary output from an hlme model, what does the ‘fixed effects in the class-membership model’ convey vs. ‘fixed effects in the longitudinal model’?

2) In my models, use of the nwg = T argument (to provide class-specific variances) seems to make the log likelihood & BIC reduce, but the entropy reduces significantly as well (see below). This would seem to suggest that the model is better, but class distinction is worse. How would you interpret this in terms of model fit?

> summarytable(md_3class, md_3class2, which = c("loglik", "BIC", "entropy", "%class"))
              loglik      BIC   entropy  %class1  %class2   %class3
md_3class  -117240.4 234586.2 0.9359884 11.08570 84.05002  4.864288
md_3class2 -115762.5 231648.1 0.6153926 50.79292 21.89692 27.310156

I also noticed that this argument appears to coerce more members into each class – without this argument, the classes are more disparate (e.g., as you can see above, some classes may have only 5% without it but with nwg = T, all classes have ~20% membership). Given this, the classes appear to be less distinct in terms of their fixed effect values:

>summary(md_3class) # nwg = F
Heterogenous linear mixed model 
     fitted by maximum likelihood method 

...

Fixed effects in the longitudinal model:

                      coef      Se     Wald p-value
intercept class1 -15.68624 0.14606 -107.399 0.00000
intercept class2  -2.26945 0.04385  -51.752 0.00000
intercept class3  -4.92480 0.21413  -22.999 0.00000
time class1   -0.19033 0.01510  -12.601 0.00000
time class2   -0.08023 0.00497  -16.142 0.00000
time class3   -1.32044 0.03481  -37.936 0.00000
...

> summary(md_3class2) # with nwg = T 
Heterogenous linear mixed model 
     fitted by maximum likelihood method 

 ...

Fixed effects in the longitudinal model:

                      coef      Se    Wald p-value
intercept class1  -0.78795 0.04676 -16.850 0.00000
intercept class2 -10.03993 0.22108 -45.414 0.00000
intercept class3  -3.50522 0.12942 -27.083 0.00000
time class1   -0.02532 0.00456  -5.549 0.00000
time class2   -0.41086 0.01985 -20.699 0.00000
time class3   -0.14744 0.01189 -12.405 0.00000
...

3) I created simulated data to test the predictClass call with a model trained on actual data. I noticed that the function appears to utilize both the slope and the intercept in classifying the subject (i.e., when I shifted all dependent variable values by 6, the classification changed). Is there any way to ignore the intercept in influencing the classification? I am interested in classifying patients primarily by their progressor status (i.e., their estimated slope).

4) When using the predictRE call to obtain random effects of new data, how can I calculate the subject-specific intercept and slopes (i.e., BLUPs)? Would these be added to the class-specific fixed effects in a weighted fashion? Per the CRAN file, the random effect predictions are averaged over classes using the posterior class-membership probabilities. Do I use the class probabilities from predictClass to provide a weight for the fixed effects and then add the random effects values from the predictRE call?

For example, here are some details on simulated data of a test patient:

> predictRE(md_3class, newdata=testset) 
id             intercept    time    
108   -2.3832044 -0.18229846 

> predictClass(md_3class, newdata = testset) 
id class        prob1       prob2     prob3  
108     3   4.524259e-01 0.003208592 0.5443655 

> summary(md_3class)
Fixed effects in the class-membership model: (the class of reference is the last class) 
coef      Se     Wald p-value 
intercept class1   0.76181 0.08986    8.477 0.00000 
intercept class2   2.76381 0.07958   34.730 0.00000 

Fixed effects in the longitudinal model: 
coef      Se     Wald p-value 
intercept class1 -15.68624 0.14606 -107.399 0.00000 
intercept class2  -2.26945 0.04385  -51.752 0.00000 
intercept class3  -4.92480 0.21413  -22.999 0.00000 
time class1   -0.19033 0.01510  -12.601 0.00000 
time class2   -0.08023 0.00497  -16.142 0.00000 
time class3   -1.32044 0.03481  -37.936 0.00000

In terms of estimated subject-specific values: Slope of id 108: .452(-0.19) + 0.003(-0.08) + 0.544(-1.32) = -0.804. Then add the RE of -0.182 = -0.986 Intercept of id 108: .452(-15.69) + 0.003(-2.27) + 0.544(-4.92) = -9.78. Then add the RE of -2.38 = -12.16

Is this correct?

5) In a similar vain, how would I calculate BLUPs of the patients in my training set (not new data)? I can extract the REs from the model using “predRE,” but do I simply add them to the fixed effects of the class they’ve been assigned to?

Thank you very much in advance!

VivianePhilipps commented 2 years ago

Hello,

For your first point, a latent class model is composed of 2 sub-models : the one modelling the probability of belonging to each latent class and the one describing the longitudinal evolution of the outcome. The class-membership model refers to the first one. So the estimations appearing in this part of the summary are the effects on the probability of belonging to the latent class. The fixed effects in the longitudinal model are the effects on the outcome's evolution.
The model with the lower BIC fits better to the data, but the model with higher entropy has a better discrimination. With latent classes, the fit is not the only criterion to take into account, so the model md_3class2 in not necessarily "the better". According to your application, you can be interesting in having a good fit or in having a good discrimination.
The classification is obtain using the whole model. You cannot use only a part of the model. But if you want to consider only different slopes between the latent classes and not different intercepts, you can remove in intercept in the mixture part of the model (mixture=~-1+time). In that way, the classes are constrained to start from the same level.

4-5. We provide in the package subject-specific predictions either for each class or average over the classes. For new data, this is achieved with the predictY function with option marg=FALSE. In the output, the pred_ss column are the average predictions, and the pred_ss_1, pred_ss_2, etc, are the predictions in each class. For the training set, the same information are provided in the pred output (md_3class2$pred for example).

Best,

Viviane

swarupswaminathan commented 2 years ago

Hi Viviane, thank you for your answers. I am a physician with a stats background, so many thanks for your patience! A few follow-up questions.

Regarding Question 3: I removed the intercept from the mixture model as you had suggested. However, I noted that the class-specific coefficients of the slope changed drastically and the optimal model changed with respect to the number of classes. Am I correct in concluding that the initial model classification using intercepts was primarily driven by the intercepts and not the slopes?

Regarding Questions 4-5: I understand that I can calculate future values of Y (which, as you said, can be predicted using predictY and the instructions provided in issue #80.) However, I was hoping to calculate the subject-specific intercept & subject-specific slope (i.e., both the class-specific fixed effect for a subject (averaged across the different classes based on the probabilities of class membership as given by postprob) & random effect components.) I can calculate the random effects (intercept & slope) using predictRE, but I can't distinguish between the intercept & slope components of the class-specific fixed effects. The marginal predictions from predictY (as discussed in issue #80) provide an estimate that combines the two. Is there any way to calculate the class-specific fixed intercept & class-specific fixed slope in order to allow me to calculate the overall subject-specific coefficients? I am interested in this data to compare the patient’s rate of change (i.e., the slope) calculated by LCMM with that of other models.

With respect to using predictY with marg = F to obtain subject-specific estimates:
I am a bit perplexed by the requirement to provide a data set (for the "new" argument in predictY) with the outcome value at the future time (i.e., if I'd like to predict the outcome at timepoint 6 based on data from timepoints 1-5, the function requires me to provide the true outcome at timepoint 6 in order to calculate the predictions.) Why must I provide the future value if I want the model to provide its prediction? As a test, I ran the model and subsequently artificially increased the outcome values of all patients at t6 by 2. The model predictions had changed, suggesting that the t6 value was influencing the model prediction. This seems to defeat the purpose. Could you clarify? Am I missing something?

CecileProust-Lima commented 2 years ago

Hi, regarding your question about the subject-specific effect, I understand that you want to compute for each subject the subject-specific effect of one covariate, let's say the time, averaged over the classes. I think what you want is:

( the sum over the classes of the class-specific fixed effect for the time (that you will find in m$best) multiplied by the posterior class-membership for this subject (that you will find in pprob) ). This gives you the subject-specific fixed effect averaged over the classes. And you want to add to that, the predicted random effect for this subject also averaged over the classes (that you will fin in predRE)

Regarding the "future values", this is not something we have added for the moment but this is indeed of great interest and will try to provide an update of predictY in the future to make it possible to predict Y values for which we do not have observations.

I hope this helps, good luck with your analyses! Cécile

swarupswaminathan commented 2 years ago

Hi Cecile - thanks for your response. If I could refer you to my original post above, I believe I did what you were suggesting (Question 4). Could you confirm if this is correct?

CecileProust-Lima commented 2 years ago

Yes, this is correct :)

swarupswaminathan commented 2 years ago

Thanks Cecile -- this worked and I confirmed that the estimates are equivalent to those estimated using the method described in #80.

Does the package allow me to estimate the p-values of the coefficients for individual subjects (not just the overall model which is provided in the summary call)? I.e., evaluation of the hypothesis test of subject-specific coefficient = 0. I am looking to analyze when the model identifies a statistically significant rate of change (different than 0) for a specific subject.

Thank you!

CecileProust-Lima commented 2 years ago

Hi, we do not specifically provide standard errors for these predictions. But you have the vector of the estimated parameters and the variance-covariance matrix of these parameters in output. So you can use this information to compute whatever you want. We usually use for this either the Delta-Method or a Monte Carlo method. For the Monte Carlo, you generate draws for the asymptotic distribution of the parameters (multivariate normal), and you compute your quantity from the generated values. You repeat that many times and you obtain the posterior distribution of the quantity you wanted to assess. Cécile

swarupswaminathan commented 2 years ago

Thanks for your note. Could you clarify how exactly the predictRE function works? Is there an accompanying paper that describes this function? I did not see it in your 2017 JSS paper but perhaps I missed this. When presented with new data, how does the function estimate random effects for specific subjects, as it has not seen these groups in the training data set?

VivianePhilipps commented 2 years ago

The random effects are estimated by the expectancy of their posterior distribution conditionally to the observed longitudinal outcome. We implement in the package the best linear unbiased prediction, which formula is given in section 5.3 of the JSS paper. These predictions can be made for the subjects of the training set as well as for new subjects. But note that for the later case you will also have to provide values for the longitudinal outcome to get the prediction of the random effects.

Viviane

swarupswaminathan commented 2 years ago

Hi, we do not specifically provide standard errors for these predictions. But you have the vector of the estimated parameters and the variance-covariance matrix of these parameters in output. So you can use this information to compute whatever you want. We usually use for this either the Delta-Method or a Monte Carlo method. For the Monte Carlo, you generate draws for the asymptotic distribution of the parameters (multivariate normal), and you compute your quantity from the generated values. You repeat that many times and you obtain the posterior distribution of the quantity you wanted to assess. Cécile

Just wanted to clarify -- are the posterior distributions for the random effects of each subject not available in the output? If not, are you suggesting that I use the standard errors of the fixed effects from the different classes + variance-covariance matrix of the random effects to calculate a standard error of the "mixed effect" (combination of the two), which I then multiply by 1.96 to produce an approximate 95% confidence interval for the subject-specific slope?

VivianePhilipps commented 2 years ago

Hi,

if you get the variance of your prediction, then with a gaussian assumption, you can build a confidence interval with pred +- 1.96sqrt(var(pred)). Here your prediction is pred = X beta + Z * E(u | Yobs) Either you compute its variance "by hand" using a delta-method, or you do it with Monte Carlo simulations. For both cases, you will need the variance of the estimated parameters available in output $V of the model. Note however that this variance matrix is associated with the parameters including the cholesky of the random effects (you get them with estimates(model)) and not directly with model$best.

Viviane

swarupswaminathan commented 2 years ago

Thank you. How would you recommend obtaining subject-specific confidence intervals for subjects used to develop the model (i.e., not new data)? The standard errors that are presented in summary are at the class-level, but I was wondering if there is a way to ascertain confidence at the subject-level?

CecileProust-Lima commented 2 years ago

Hi, For obtaining confidence Interval of a function of the parameters, you can always use the parametric bootstrap method which consists in approximating the distribution of the function you search using monte Carlo simulations. This is what we do when we want to compute confidence intervals of individual predictions of events in jointlcmm :

you draw parameters from the asymptotic distribution of the parameters (that is multivariate normal with mean the estimate and variance the variance given in V)
you compute the function you want. If this includes some random effects, you can estimate these random effects with the functions included in lcmm (by creating a replicated object in which you replace best by the draw of the parameters)
you do that many times (2000?) and you can obtain the confidence interval by looking at the 2.5 and 97.5 percentile. I hope this helps Cécile

CecileProust-Lima / lcmm

hlme: nwg argument, interpreting model descriptors, calculating subject-specific intercept/slopes #109