drizopoulos / JM

Joint Models for Longitudinal & Survival Data under Maximum Likelihood

Can the joint modeling framework accept patients with different start times? #13

Closed winterwang closed 5 years ago

winterwang commented 5 years ago

Hi, thanks for the amazing package. I am wondering whether the two parts of the joint modeling framework can accept patients who enter the study/observation at different time points.

I am working with data from a patient database. In this real-world setting, patients visit the doctor and have blood samples drawn at completely different time points and intervals; some patients started early, while others have their first recording later.

So my question is whether the survival part of the Joint Modeling within this package can be fitted using the following form?

coxFit.DF <- coxph(Surv(start, stop, event) ~ sex + age + group, data, x = TRUE)

I ask because when I instead fitted the Cox model on the follow-up duration, time = stop - start:

coxFit.DF <- coxph(Surv(time, event) ~ sex + age + group, data, x = TRUE)

I got errors saying that some patients had observations after their events. When I used the Surv(start, stop, event) version, an error message also came up asking me to refit the Cox model with the x = TRUE argument, which I had already supplied.

One advantage of the time-dependent Cox regression model is its flexibility: patients are allowed to enter the study at different start points. If there is a way to allow for this in the joint modeling framework as well, I would greatly appreciate seeing how.

Many thanks.
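One common way to handle staggered entry is to move every patient onto their own clock, so that time 0 is the patient's first visit. The sketch below does this in base R on toy data. It assumes that "time since first visit" is the time scale of interest; all column names (`visit_date`, `end_date`, `obstime`, `Time`) are hypothetical and not taken from the poster's data. It does not show how to fit a delayed-entry model with `Surv(start, stop, event)`; whether the package supports that is not settled in this thread.

```r
# Toy registry data; every column name here is hypothetical.
raw <- data.frame(
  id         = c(1, 1, 1, 2, 2),
  visit_date = as.Date(c("2015-01-10", "2015-07-01", "2016-02-15",
                         "2017-03-05", "2017-09-20")),
  end_date   = as.Date(c("2016-06-01", "2016-06-01", "2016-06-01",
                         "2018-01-15", "2018-01-15")),  # event or censoring date
  event      = c(1, 1, 1, 0, 0)
)

# Time origin = each patient's own first visit, so everyone starts at 0.
first_visit <- ave(as.numeric(raw$visit_date), raw$id, FUN = min)
raw$obstime <- (as.numeric(raw$visit_date) - first_visit) / 365.25  # years since entry
raw$Time    <- (as.numeric(raw$end_date)   - first_visit) / 365.25  # follow-up in years

# Longitudinal measurements must not fall after the event/censoring time.
late    <- raw$obstime > raw$Time
long_df <- raw[!late, ]  # inspect raw[late, ] before dropping rows in real data

# One row per patient for the survival submodel.
surv_df <- long_df[!duplicated(long_df$id), c("id", "Time", "event")]
```

With this layout, `long_df` would feed the mixed model (with `obstime` as the time variable) and `surv_df` the Cox model with `Surv(Time, event)`. Both use the same origin and the same unit, which is what the "observations after the events" error above is about.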

drizopoulos commented 5 years ago

Thanks for your e-mail and interest in my package. Regarding your question, please check the following:

On 5/20/2019 9:11 AM, Chaochen Wang wrote:


--
Dimitris Rizopoulos
Professor of Biostatistics
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web (personal): http://www.drizopoulos.com/
Web (work): http://www.erasmusmc.nl/biostatistiek/
Blog: http://iprogn.blogspot.nl/

winterwang commented 5 years ago

Hi, I tried the following, based on what you suggested (twice). No patient has zero follow-up time, and missing data were excluded from the start.

According to your book, compared with the time-dependent Cox model, the joint model should give a less underestimated (less attenuated) association. However, that does not seem to be the case in my data. The time-dependent Cox model, which uses a biomarker measured at each patient visit as the main predictor, is as follows:

Call:
coxph(formula = Surv(as.numeric(START0), as.numeric(LaboDate), 
    MACE) ~ logMarker + BASE_AGE + SEX, data = ANA_df, 
    ties = "breslow")

  n= 689548, number of events= 1959 

                         coef exp(coef)  se(coef)       z Pr(>|z|)    
logMarker            0.638440  1.893525  0.138547   4.608 4.06e-06 ***
BASE_AGE             0.024577  1.024881  0.002027  12.125  < 2e-16 ***
SEX                 -0.535464  0.585398  0.050854 -10.530  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

                    exp(coef) exp(-coef) lower .95 upper .95
I(log(RESULT_MEAN))    1.8935     0.5281    1.4432    2.4843
BASE_AGE               1.0249     0.9757    1.0208    1.0290
SEX                    0.5854     1.7082    0.5299    0.6468

Concordance= 0.605  (se = 0.006 )
Likelihood ratio test= 252.6  on 3 df,   p=<2e-16
Wald test            = 238.8  on 3 df,   p=<2e-16
Score (logrank) test = 239.5  on 3 df,   p=<2e-16

So I expected the joint model to show a stronger association between this biomarker and the hazard of the event.
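For reference, a time-dependent Cox model in counting-process form expects non-overlapping (start, stop] intervals per patient, with the event flagged only on the interval in which it occurs. The sketch below builds such intervals from one-row-per-visit toy data. It is a general illustration with hypothetical column names. The thread does not say how START0 and LaboDate were defined, so it is not a diagnosis of the model above.

```r
# Toy long-format data, one row per visit; all names are hypothetical.
visits <- data.frame(
  id      = c(1, 1, 1, 2, 2),
  obstime = c(0, 1.0, 2.5, 0, 0.8),      # visit time since the patient's first visit
  marker  = c(2.0, 2.1, 2.3, 1.9, 1.8),
  Time    = c(3.0, 3.0, 3.0, 1.5, 1.5),  # event/censoring time, same origin
  event   = c(1, 1, 1, 0, 0)
)
visits <- visits[order(visits$id, visits$obstime), ]

# Each visit opens an interval that closes at the next visit,
# or at Time for the patient's last visit.
next_visit    <- ave(visits$obstime, visits$id, FUN = function(t) c(t[-1], NA))
is_last       <- is.na(next_visit)
visits$start  <- visits$obstime
visits$stop   <- ifelse(is_last, visits$Time, next_visit)

# The event indicator is 1 only on the interval in which the event occurs.
visits$status <- as.integer(is_last & visits$event == 1)

# If the survival package is available, the intervals form a valid Surv object.
if (requireNamespace("survival", quietly = TRUE)) {
  print(survival::Surv(visits$start, visits$stop, visits$status))
}
```

In this layout the marker value recorded at a visit is carried forward, unchanged, until the next visit. That last-value-carried-forward assumption, applied to a marker measured with error, is what the joint model is meant to relax.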

However, when I fitted the linear mixed-effects model and the Cox regression model following your approach, the association disappeared:

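library(JM)  # also attaches nlme (lme) and survival (coxph)

# Longitudinal submodel: random intercept and slope per patient
lmeFit.DF <- lme(logMarker ~ obstime + SEX + BASE_AGE,
                 random = ~ obstime | PatientId, data = ANA_df)

# Survival submodel: one row per patient; jointModel() needs x = TRUE
coxFit.DF <- coxph(Surv(Time, MACE) ~ SEX + BASE_AGE, data = DF_1strow,
                   x = TRUE, model = TRUE)

# Joint model with a piecewise-constant baseline hazard
jointFit.DF <- jointModel(lmeFit.DF, coxFit.DF, timeVar = "obstime",
                          method = "piecewise-PH-aGH")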
# Call:
#   jointModel(lmeObject = lmeFit.DF, survObject = coxFit.DF, timeVar = "obstime", 
#              method = "piecewise-PH-aGH")
# 
# Data Descriptives:
#   Longitudinal Process        Event Process
# Number of Observations: 689548    Number of Events: 1959 (3.7%)
# Number of Groups: 52511
# 
# Joint Model Summary:
#   Longitudinal Process: Linear mixed-effects model
# Event Process: Relative risk model with piecewise-constant
# baseline risk function
# Parameterization: Time-dependent 
# 
# log.Lik      AIC      BIC
# 589034 -1178032 -1177872
# 
# Variance Components:
#   StdDev    Corr
# (Intercept)  0.1511  (Intr)
# obstime      0.0001 -0.4542
# Residual     0.0853        
# 
# Coefficients:
#   Longitudinal Process
#           Value Std.Err  z-value p-value
# (Intercept)  2.0248  0.0037 552.4671 <0.0001
# obstime      0.0000  0.0000 -28.0565 <0.0001
# SEX          0.0066  0.0013   4.9324 <0.0001
# BASE_AGE    -0.0014  0.0000 -27.9929 <0.0001
# 
# Event Process
#           Value Std.Err  z-value p-value
# SEX        -0.5044  0.0507  -9.9453 <0.0001
# BASE_AGE    0.0312  0.0022  14.0130 <0.0001
# Assoct      0.3949  0.4403   0.8969  0.3698
# log(xi.1) -13.8911  0.9143 -15.1927        
# log(xi.2) -13.1572  0.9062 -14.5189        
# log(xi.3) -12.9483  0.9031 -14.3368        
# log(xi.4) -12.6816  0.9021 -14.0575        
# log(xi.5) -12.4629  0.9027 -13.8065        
# log(xi.6) -12.1712  0.9070 -13.4184        
# log(xi.7)  -9.6649  0.9121 -10.5962        
# 
# Integration:
#   method: (pseudo) adaptive Gauss-Hermite
# quadrature points: 3 
# 
# Optimization:
#   Convergence: 0 

Do you have any suggestions on how to interpret this result? Maybe I could try adding more markers, since they were measured at the same longitudinal time points.

drizopoulos commented 5 years ago

Indeed, the bias in the Cox model is most often downwards, but not always.
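One way to read the two estimates side by side is to compare their Wald intervals. The sketch below uses only the coefficients and standard errors printed in the summaries above; the helper `wald_ci` is introduced here for illustration. Keep in mind that the two coefficients are not the same quantity: the Cox coefficient refers to the observed marker value, while the joint-model association refers to the underlying, error-free marker trajectory.

```r
# Estimates copied from the two model summaries in this thread.
cox_coef <- 0.638440; cox_se <- 0.138547  # logMarker, time-dependent Cox model
jm_assoc <- 0.3949;   jm_se  <- 0.4403    # Assoct, joint model

# 95% Wald interval on the log-hazard scale (illustrative helper).
z <- qnorm(0.975)
wald_ci <- function(est, se) c(lower = est - z * se, upper = est + z * se)

cox_ci <- wald_ci(cox_coef, cox_se)
jm_ci  <- wald_ci(jm_assoc, jm_se)

# Hazard-ratio scale.
round(exp(cox_ci), 2)
round(exp(jm_ci), 2)

# Does the joint-model interval cover the Cox point estimate?
unname(jm_ci["lower"] < cox_coef & cox_coef < jm_ci["upper"])
```

The Cox interval reproduces the 1.44 to 2.48 range in the output above. The joint-model interval is much wider, roughly 0.6 to 3.5 on the hazard-ratio scale, and contains the Cox estimate. On these numbers, the joint model has not ruled out an association. Its standard error is about three times larger, so the estimate is too imprecise to distinguish from either zero or the Cox value.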

From: Chaochen Wang <notifications@github.com>
Date: Friday, 24 May 2019, 9:13 AM
To: drizopoulos/JM <JM@noreply.github.com>
Cc: D. Rizopoulos <d.rizopoulos@erasmusmc.nl>, Comment <comment@noreply.github.com>
Subject: Re: [drizopoulos/JM] Can the joint modeling frame work accept patients with different start time ? (#13)


winterwang commented 5 years ago

OK, thanks for your help. So it was not because of any error in my coding?