CecileProust-Lima / lcmm

R package lcmm
https://CecileProust-Lima.github.io/lcmm/

Interpretation of classmb with fixed parameters #198

Closed tspmgh closed 11 months ago

tspmgh commented 1 year ago

I have fitted an hlme model of lung function as a function of age, and selected the model with 4 classes.

m4 <- hlme(fixed = zfev1 ~ age, mixture = ~age, random = ~age, subject = "id_no", ng = 4, data = lf_data, verbose = TRUE)

I am now trying to see how the identified classes relate to a variable that was not included in the model (genetics). Hence, I have fitted a new hlme model with genetics (4 levels) in the classmb part of the model, while keeping the other parameters fixed, as explained in previous issues, to avoid shifts in the classes:

# model estimates
Esti <- m4$best

# Vector of initial values for np covariates in classmb for ng classes
ng <- 4 # Number of classes
np <- 3 # Number of covariates in classmb

Binit <- c(Esti[1:(ng-1)], rep(0, np*(ng-1)), Esti[-(1:(ng-1))]) 

# Fixing parameters in the latent class model
FixedParms <- ((np+1)*(ng-1)+1):length(Binit)

# model with fixed parameters
m4_post <- hlme(zfev1 ~ age, random = ~age, subject = "id_no", data = lf_data, ng = 4, mixture = ~age, B = Binit, posfix = FixedParms, classmb = ~genetics, verbose = TRUE)
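To check the index arithmetic above without refitting anything, here is a minimal sketch on a made-up parameter vector. The vector `Esti_toy` and its length of 15 are assumptions for illustration only (they stand in for `m4$best`, whose actual length should be read from the fitted model); only the construction of `Binit` and `FixedParms` is taken from the code above.

```r
# Hypothetical stand-in for m4$best: 3 class-membership intercepts
# followed by 12 other parameters (values are arbitrary).
ng <- 4  # number of classes
np <- 3  # number of classmb covariates (3 dummies for a 4-level factor)
Esti_toy <- seq(0.1, 1.5, by = 0.1)

# Same construction as above: intercepts, then zeros for the new
# covariate effects, then every remaining parameter unchanged.
Binit <- c(Esti_toy[1:(ng - 1)], rep(0, np * (ng - 1)), Esti_toy[-(1:(ng - 1))])

# Positions of the parameters carried over from the first model.
FixedParms <- ((np + 1) * (ng - 1) + 1):length(Binit)

# The fixed positions should hold exactly the non-classmb estimates,
# and the free positions the intercepts plus the new zeros.
print(FixedParms)
print(all.equal(Binit[FixedParms], Esti_toy[-(1:(ng - 1))]))
print(Binit[-FixedParms])
```

With these toy values, the free parameters are the first `(np+1)*(ng-1)` entries, i.e. only the class-membership part is re-estimated.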

Then, as I understand it, the exponentiated coefficients of the classmb model are odds ratios (relative to the reference class). Hence, I can get odds ratios and 95% confidence intervals with exp(coef(m4_post)) and exp(confint(m4_post)).

Fixed effects in the class-membership model:
(the class of reference is the last class) 

                      coef        Se   Wald p-value
intercept class1   2.58793   0.82698  3.129 0.00175
intercept class2   0.04539   1.28330  0.035 0.97178
intercept class3   1.60977   0.86146  1.869 0.06167
group1 class1     -0.95373   1.12738 -0.846 0.39757
group1 class2      0.95837   1.55363  0.617 0.53733
group1 class3    -16.57808 934.20808 -0.018 0.98584
group2 class1     -0.07933   1.09603 -0.072 0.94230
group2 class2     -0.12358   1.70853 -0.072 0.94234
group2 class3     -1.32906   1.25844 -1.056 0.29092
group3 class1     -1.82020   1.04437 -1.743 0.08136
group3 class2    -12.89349 497.98885 -0.026 0.97934
group3 class3    -16.88994 707.70337 -0.024 0.98096

# Table with exp(coef(m4_post)) and exp(confint(m4_post))
                 exp(coef)  lower 95%  upper 95%
group1 class1 3.853016e-01 0.04228357  3.510993
group1 class2 2.607451e+00 0.12409679 54.786260
group1 class3 6.312964e-08 0.00000000       Inf
group2 class1 9.237320e-01 0.10779519  7.915760
group2 class2 8.837485e-01 0.03104726 25.155567
group2 class3 2.647267e-01 0.02247030  3.118795
group3 class1 1.619928e-01 0.02091839  1.254478
group3 class2 2.514376e-06 0.00000000       Inf
group3 class3 4.621616e-08 0.00000000       Inf
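The intervals in this table are consistent with Wald intervals, exp(coef ± 1.96 × Se), computed from the summary output. As a minimal sketch, the three group1 rows can be reproduced by hand from the printed (rounded) coefficients and standard errors, so the last digits may differ slightly. The object names below are illustrative, not part of lcmm.

```r
# Coefficients and standard errors copied from the summary above
# (group1 rows only; values are rounded as printed).
est <- c("group1 class1" = -0.95373,
         "group1 class2" =  0.95837,
         "group1 class3" = -16.57808)
se  <- c(1.12738, 1.55363, 934.20808)

z <- qnorm(0.975)  # about 1.96 for a 95% interval

or_table <- cbind(OR    = exp(est),
                  lower = exp(est - z * se),
                  upper = exp(est + z * se))
print(or_table)
```

For group1 class3 the standard error is so large that the lower bound underflows to 0 and the upper bound overflows to Inf, which is what the table shows.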

I do not have a statistical background, and I am unsure about my results and whether I have applied this approach correctly. Any guidance will be greatly appreciated.

All the best from a phd-student struggling in solitude with all the doubts and worries that come with the journey :-)

VivianePhilipps commented 1 year ago

Hi,

Yes, you get the odds ratios by taking the exponential of the coefficients. Here you get a very small (large negative) coefficient with a huge standard error for 'group1 class3'. This probably means that in latent class 3 you have few subjects with the genetic characteristic 'group1', so the model is not able to estimate the effect. The same applies to 'group3 class2' and 'group3 class3'. Maybe you should keep fewer levels in your genetic variable, by grouping some levels if it makes sense.

Best,

Viviane
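One way to check the explanation above is to cross-tabulate the most likely class of each subject against the genetic variable and look for empty or near-empty cells. The sketch below uses made-up data: `pprob_toy` imitates the id and class columns of the `pprob` element that lcmm models return, and `covar_toy`, the level names, and the two-level regrouping are all hypothetical. With real data, `m4$pprob` and one row per subject of `lf_data` would take their place.

```r
# Toy stand-in for m4$pprob: subject id and most likely class.
pprob_toy <- data.frame(id_no = 1:14,
                        class = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4))

# Toy subject-level covariate with 4 levels (first level = reference).
lv <- c("ref", "group1", "group2", "group3")
covar_toy <- data.frame(
  id_no = 1:14,
  genetics = factor(c("ref", "group1", "group2", "group3",
                      "ref", "group1", "group2",
                      "ref", "group2", "group2",
                      "ref", "group1", "group2", "group3"),
                    levels = lv))

d <- merge(pprob_toy, covar_toy, by = "id_no")

# Class-by-genotype counts; zero cells are the combinations for which
# a class-membership coefficient cannot be estimated reliably.
tab <- table(class = d$class, genetics = d$genetics)
print(tab)
empty_cells <- which(tab == 0, arr.ind = TRUE)
print(empty_cells)

# Hypothetical regrouping into two levels; whether any grouping makes
# sense depends on the genetics, not on the code.
d$genetics2 <- d$genetics
levels(d$genetics2) <- list(ref = "ref",
                            variant = c("group1", "group2", "group3"))
tab2 <- table(class = d$class, genetics = d$genetics2)
print(tab2)
```

In this toy example the empty cells are the class 3 / group1, class 2 / group3 and class 3 / group3 combinations, and they disappear after regrouping.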