The number of subjects in the LCGA statistical model is less than in the original dataset

Esther-ye2024 commented 5 months ago

Hello @CecileProust-Lima, thank you for developing the package!

I am using the lcmm package and its hlme() function to identify latent classes in a dataset with 202 subjects. There are missing data in the outcome variables. When I run the LCGA with hlme(), the number of subjects reported in the statistical model is 201, which is one fewer than in the original dataset. There are no duplicate subject IDs in the original dataset. However, the total number of observations is correct (202*5=1010, reported as 790 used plus 220 deleted).

Here is my code and result.

lcga1 <- hlme(Outcome ~ Time, subject = 'subject', ng = 1, data = long_anxiety)

Heterogenous linear mixed model fitted by maximum likelihood method

hlme(fixed = Outcome ~ Time, subject = "subject", ng = 1, data = long_anxiety)

Statistical Model:
     Dataset: long_anxiety
     Number of subjects: 201
     Number of observations: 790
     Number of observations deleted: 220
     Number of latent classes: 1
     Number of parameters: 3

Thank you very much in advance for your assistance with this issue and kind regards, Esther

VivianePhilipps commented 5 months ago

Hi Esther,

the number of subjects indicated in the summary is the number of subjects with at least one measurement of the outcome variable. Indeed, only these subjects will be used to estimate the model. If you remove the missing values from your dataset (so keeping only the 790 observations), you should get the 201 subjects mentioned in the summary.
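One way to check this before fitting is to count the non-missing outcome measurements per subject. The following is a minimal sketch in base R, not taken from the lcmm documentation: the data frame long_toy and its values are made up for illustration, and only the column names (subject, Time, Outcome) mirror the call above.

```r
# Hypothetical toy data in long format: 3 subjects x 5 time points.
long_toy <- data.frame(
  subject = rep(1:3, each = 5),
  Time    = rep(0:4, times = 3),
  Outcome = c(10, 11, NA, 12, 13,   # subject 1: one missing value
              NA, NA, NA, NA, NA,   # subject 2: outcome never observed
               9, NA, 10, NA, 11)   # subject 3: two missing values
)

# Number of non-missing outcome measurements for each subject.
n_obs <- tapply(!is.na(long_toy$Outcome), long_toy$subject, sum)

# Subjects with no outcome measurement at all; these cannot contribute
# to the estimation and are not counted in "Number of subjects".
dropped <- names(n_obs)[n_obs == 0]

# Counts comparable to the lines of the model summary.
n_subjects_used <- sum(n_obs > 0)               # subjects with >= 1 measurement
n_obs_used      <- sum(!is.na(long_toy$Outcome)) # observations kept
n_obs_deleted   <- sum(is.na(long_toy$Outcome))  # observations deleted

# Equivalent check: drop the missing rows, then count distinct subjects.
long_complete <- long_toy[!is.na(long_toy$Outcome), ]
n_subjects_complete <- length(unique(long_complete$subject))
```

Applied to your own data (replacing long_toy with long_anxiety), dropped should list the one subject that is absent from the summary, assuming the missing outcomes are coded as NA.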

Viviane

Esther-ye2024 commented 5 months ago

Hi Viviane,

Thank you very much for your reply. I checked the dataset and found that there is one subject with missing data in all outcome variables. I had not noticed this when I ran the models. Your explanation is really helpful.

Thank you again. Esther

CecileProust-Lima / lcmm

The number of subjects in the LCGA statistical model is less than in the original dataset #259