runehaubo / lmerTestR

Repository for the R-package lmerTest

Question about model selection, step function and missing data? #50

Open HKJ396 opened 2 years ago

HKJ396 commented 2 years ago

Dear authors,

I am using lmerTest because I have longitudinal data. My outcome is a breast symptom score measured at 4 time points. I have a variety of clinical/treatment covariates as fixed effects, and I am using ID as a random effect since each patient reported scores at 4 different times. I successfully ran the model and the results look good.

An example of my long-format data with the first 4 covariates (fixed effects) and the outcome variable:

id  age  bmi  smoking  chemo  outcome
1   62   29   1        0      75.7
1   62   29   1        0      100
1   62   29   1        0      75.7
1   62   29   1        0      NA
2   50   30   0        1      70
2   50   30   0        1      80
2   50   30   0        1      20
2   50   30   0        1      100

My question is about missing data and the step function. I assume lmerTest filters out missing data (NAs) at the covariate level, e.g. if someone has a missing age. However, if someone has a missing outcome (in my case, a missing breast symptom score at one or two time points), I assume the patient is still retained in the analysis.

When I run the step function, I get this error:

Error: number of rows in use has changed: remove missing values?

I then went back to my data and filtered out every instance of missing values/NAs, which I assume wipes out any patient who has missing data at the outcome level and therefore retains only patients with complete cases. I reran step and got no error.
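A minimal base R sketch of the two ways such filtering can work, using the example rows above. It is an illustration only: the data frame `d`, the vector `model_vars`, and the choice of variables are assumptions, not the actual analysis. It shows that a row-wise filter such as `complete.cases()` drops only the incomplete rows, while removing whole patients requires an explicit extra step.

```r
# Toy data copied from the example rows above (illustrative, not the real data set).
d <- data.frame(
  id      = c(1, 1, 1, 1, 2, 2, 2, 2),
  age     = c(62, 62, 62, 62, 50, 50, 50, 50),
  bmi     = c(29, 29, 29, 29, 30, 30, 30, 30),
  smoking = c(1, 1, 1, 1, 0, 0, 0, 0),
  chemo   = c(0, 0, 0, 0, 1, 1, 1, 1),
  outcome = c(75.7, 100, 75.7, NA, 70, 80, 20, 100)
)

# Assumed set of variables entering the model; adjust to the real formula.
model_vars <- c("id", "age", "bmi", "smoking", "chemo", "outcome")

# Row-wise filter: drop only the rows that have an NA in any model variable.
keep_row <- complete.cases(d[, model_vars])
d_rows <- d[keep_row, ]

# Patient-wise filter: drop every row of any patient with at least one incomplete row.
incomplete_ids <- unique(d$id[!keep_row])
d_patients <- d[!(d$id %in% incomplete_ids), ]

nrow(d_rows)        # 7: patient 1 keeps the three observed time points
table(d_rows$id)
nrow(d_patients)    # 4: only patient 2 remains
```

Either data frame contains no NAs in the listed variables, so any model refitted on it would use the same rows each time. Judging by the wording of the error message, that is presumably the condition step needs, though this is an assumption rather than something confirmed here.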

My question is: why do I need to filter out NAs for the step function, while the drop1 function gives no such error?