Open buttrey opened 6 years ago
Is the story that ranger is not (yet) set up to handle this flavor of Surv() object?
Exactly. For now we have to check the Surv object and produce an error for your example.
Is there any literature on RF for left censoring or truncation?
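As an illustration of the kind of check mentioned in this reply, here is a minimal sketch in R. It is not ranger's actual code: `check_right_censored` is a hypothetical helper, and the only assumption about the survival package is that a `Surv` object stores its censoring type in the `type` attribute.

```r
library(survival)

# Hypothetical helper (not part of ranger): accept only right-censored
# responses and stop with an informative error for anything else.
check_right_censored <- function(y) {
  stopifnot(inherits(y, "Surv"))
  type <- attr(y, "type")
  if (type != "right") {
    stop("Unsupported Surv type '", type,
         "': only right-censored responses are handled.")
  }
  invisible(y)
}

# Two-argument form: right-censored, passes the check.
ok <- Surv(c(4, 6, 9), c(1, 0, 1))
check_right_censored(ok)

# Three-argument form: counting-process (start, stop, event), rejected.
bad <- Surv(c(0, 2, 5), c(4, 6, 9), c(1, 0, 1))
res <- tryCatch(check_right_censored(bad),
                error = function(e) conditionMessage(e))
res
```

Failing loudly here is preferable to silently fitting on a response the splitting rule was not designed for.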
Hi Marvin, Thanks for your speedy and helpful reply.
There is literature on fitting single trees for left-truncation. This paper:
Fu and Simonoff (2017), “Survival trees for left-truncated and right-censored data, with application to time-varying covariate data,” Biostatistics 18 (2), 352-369
describes their approach, which is implemented in the LTRCtrees package on CRAN (see this vignette: https://cran.r-project.org/web/packages/LTRCtrees/vignettes/LTRCtrees.html). This package builds on rpart and partykit.
But I haven’t found anything on random forests for survival trees with left-truncation.
Have fun, Sam Buttrey
Thanks. I probably won't have the time to implement it soon. However, I'm happy to help if someone wants to do so.
Sure, I understand. Thanks for the replies.
Hello,
Just a note for anyone who wants to fit an RF to left-truncated data: the inability to account for left-truncation explicitly can be offset by modeling time-on-study instead and including the left-truncation time as a covariate.
With this approach, the original example, ranger(Surv(time1, time2, event) ~ age + sex + log.bili, data = pbcseq), would become ranger(Surv(time2 - time1, event) ~ time1 + age + sex + log.bili, data = pbcseq).
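A minimal sketch of this time-on-study reformulation follows. To stay self-contained it runs on synthetic data rather than pbcseq; the column names (`entry`, `exit`, `event`, `time_on_study`) and the simulation itself are assumptions made only to show the shape of the call, not a realistic left-truncation mechanism. The response is computed as a column beforehand, so the sketch does not rely on arithmetic inside `Surv()` in the formula.

```r
library(survival)
library(ranger)

set.seed(1)
n <- 200

# Synthetic, illustrative data only:
#   entry = left-truncation (delayed entry) time
#   exit  = event or censoring time, strictly after entry
#   event = 1 if the event was observed, 0 if censored
d <- data.frame(
  entry = runif(n, 0, 5),
  age   = rnorm(n, 50, 10),
  sex   = factor(sample(c("f", "m"), n, replace = TRUE))
)
d$exit  <- d$entry + rexp(n, rate = 0.1) + 0.01
d$event <- rbinom(n, 1, 0.7)

# Time-on-study becomes the response; the entry time is an ordinary covariate.
d$time_on_study <- d$exit - d$entry

rf <- ranger(Surv(time_on_study, event) ~ entry + age + sex,
             data = d, num.trees = 100)

# For survival forests, predictions are survival curves evaluated at the
# unique event times: one row per observation, one column per time point.
pred <- predict(rf, data = d)
dim(pred$survival)
```

Note that the predicted curves are on the time-on-study scale, so they describe time since entry rather than time since the original origin.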
Although this may seem irksome when coming from parametric or semi-parametric models, in practice it gives reasonable results. In fact, in the medical literature, most studies adopt this approach even when working with parametric models.
All in all, the inability to model left-truncation seems to be a minor hindrance, and that's the reason why no one bothers to implement it.
PS Thanks for the great package !
Hi. I have left-truncated data. Normally I would express this by calling Surv (time1, time2, event, type = "counting"), where time1 is the time at which I first saw this observation, time2 is the event or censoring time, and event describes the event. [I'm hoping that type = "counting" specifies that time1 is a truncation time, rather than a left-censoring time.] But although ranger runs with this setup, it produces prediction times that are uniformly 1. Consider the example on the last line of the help page for the pbcseq data set in the survival library. With a little modification, like explicitly adding a column named log.bili, we see this:
rf <- ranger(Surv(time1, time2, event) ~ age + sex + log.bili, data = pbcseq)
but then
all(predict(rf, data = pbcseq)$preds == 1) # produces TRUE
Every prediction is 1. (I get the same result with type = "counting".) That seems off. Is the story that ranger is not (yet) set up to handle this flavor of Surv() object?
Thanks, Sam Buttrey
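On the bracketed question about `type = "counting"`: in the survival package, the three-argument form `Surv(time1, time2, event)` defaults to the counting-process type, which is the representation used for delayed entry (left truncation), while left censoring is a separate type requested with `type = "left"` on a two-argument call. A small sketch with made-up values (the vectors below are illustrative, not taken from pbcseq):

```r
library(survival)

# Illustrative values only.
entry <- c(0, 2, 5)   # time each subject came under observation
exit  <- c(4, 6, 9)   # event or censoring time
event <- c(1, 0, 1)   # 1 = event observed, 0 = censored

# Three arguments: counting-process intervals (start, stop], i.e. delayed entry.
s_trunc <- Surv(entry, exit, event)
attr(s_trunc, "type")

# Left censoring is a different type and uses a single time column.
s_left <- Surv(exit, event, type = "left")
attr(s_left, "type")
```

Inspecting the `type` attribute this way is a quick sanity check on what a given `Surv()` call actually encodes before handing it to a modeling function.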