pharmaverse / ggsurvfit

http://www.danieldsjoberg.com/ggsurvfit/

include `survfit(...)$start.time` information when plotting #192

Closed bethatkinson closed 2 months ago

bethatkinson commented 3 months ago

I have an instance where my call to ggcuminc includes 4 cause-specific deaths and 4 strata. I don't mind whether the linetypes are included, but I would like the color coding to be applied to the endpoints, not the strata.

ggcuminc(flsurv1, outcome=c("Circulatory","Neoplams","Respiratory","Other"), theme=theme_classic()) +
  coord_cartesian(xlim=c(50,100)) +
  xlab("Age") +
  ylab("P(death)") +
  theme_classic() +
  facet_wrap(~strata, ncol = 2)

I also noticed that the xlim starting at a non-zero value is a bit clunky - it would be nice if the function would pay attention to fit$start.time.

flsurv1 <- survfit(Surv(age, age2, dtype) ~ group + sex, data = flc2, id = id, start.time = 50)

flsurv1$start.time
[1] 50

Code I'm using to create the figure I want

Using survfit0 to create a "survfit" object that "tidy" understands

flsurv1b <- survfit0(flsurv1)
dat <- tidy(flsurv1b)
dat$state <- factor(dat$state, levels = c('(s0)', levels(flc2$dtype)[2:5]))
dat$group <- factor(dat$strata, levels = names(flsurv1$strata)[c(1,3,2,4)])
ggplot(dat[dat$state != '(s0)',], aes(time, estimate, color = state, linetype = state)) +
  geom_step() +
  facet_wrap(~group, ncol = 2) +
  xlab("Age") +
  ylab("P(death)") +
  theme_classic()

ddsjoberg commented 3 months ago

Hi @bethatkinson !! I have a global option that switches the color/linetype coding. Will this work for you? https://www.danieldsjoberg.com/ggsurvfit/reference/ggsurvfit_options.html

bethatkinson commented 3 months ago

@ddsjoberg - Yes, this seems to fix the issue - thanks.
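A minimal sketch of switching that option on for a single plot and then restoring the previous setting. It assumes the option name `ggsurvfit.switch-color-linetype` that appears in the code later in this thread; the set-and-restore pattern itself is plain base R and nothing here is specific to ggsurvfit.

```r
# Set the option and keep the previous settings so they can be restored.
old_opts <- options("ggsurvfit.switch-color-linetype" = TRUE)

# ... build the ggcuminc() plot here while the option is active ...
switched <- getOption("ggsurvfit.switch-color-linetype")

# Restore whatever was set before (the option is removed if it was unset).
options(old_opts)
restored <- getOption("ggsurvfit.switch-color-linetype")
```

Restoring the option afterwards keeps the switch from leaking into other plots in the same session.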

ddsjoberg commented 3 months ago

Great! The global option is not the best solution; I'm really just trying it out before it gets further integrated into the function call.

Regarding your other important point about the starting point. Would you mind putting together a small reproducible example showing the bad default in a new issue and I can address it when I have the bandwidth? Thanks!!

bethatkinson commented 3 months ago

Will do

bethatkinson commented 3 months ago

Here is a simple example. In my real project this comes up more often on the age scale, but this shows what happens:

library(survival)
library(ggsurvfit)

aml$start <- 100
aml$stop <- aml$time/12 + 100

fit <- survfit2(Surv(start,stop,status)~x, data=aml, start.time=100)
fit$start.time
plot(fit)      ## plot starts at 100 on x-axis
ggsurvfit(fit) ## plot starts at 0 on x-axis
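Until the default changes, one possible workaround is to read the start time off the fit and pass it to `coord_cartesian()`, as in the first post. The sketch below is only an illustration: the copy `aml2` and the names `x_start` and `x_limits` are made up here, and the fallback used when `start.time` is absent is an assumption rather than documented ggsurvfit behavior.

```r
library(survival)

# Same toy data as above: everyone enters at time 100.
aml2 <- aml
aml2$start <- 100
aml2$stop <- aml2$time / 12 + 100

fit <- survfit(Surv(start, stop, status) ~ x, data = aml2, start.time = 100)

# Use the stored start time when present; otherwise fall back to 0
# (or to the smallest observed time if that is negative).
x_start <- if (!is.null(fit$start.time)) fit$start.time else min(c(0, fit$time))
x_limits <- c(x_start, max(fit$time))

# Apply the limits to the ggsurvfit plot, if the package is installed.
if (requireNamespace("ggsurvfit", quietly = TRUE)) {
  fit2 <- ggsurvfit::survfit2(Surv(start, stop, status) ~ x,
                              data = aml2, start.time = 100)
  p <- ggsurvfit::ggsurvfit(fit2) +
    ggplot2::coord_cartesian(xlim = x_limits)
}
```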

bethatkinson commented 3 months ago

Actually, this isn't working quite right.

library(survival)

# Create event (factor), only use subjects with some amount of follow-up
flc2 <- subset(flchain, futime > 0)
dtype <- with(flc2, ifelse(is.na(chapter), 0,
                    ifelse(chapter == "Blood" | chapter == "Circulatory", 1,
                    ifelse(chapter == "Neoplasms", 2,
                    ifelse(chapter == "Respiratory", 3, 4)))))
flc2$dtype <- factor(dtype, 0:4, c("censor", "Circulatory", "Neoplams", "Respiratory", "Other"))

# covariate of interest (FLC group)
flc2$group <- factor(1*(flc2$flc.grp == 10), levels = c(0,1), labels = c('Low FLC','High FLC'))

flc2$id <- 1:nrow(flc2)
flc2$age2 <- flc2$age + flc2$futime/365.25

library(ggsurvfit)
flsurv1 <- survfit2(Surv(age, age2, dtype) ~ group + sex, data = flc2, id = id, start.time = 50)
options("ggsurvfit.switch-color-linetype" = TRUE)

ggcuminc(flsurv1, outcome=c("Circulatory","Neoplams","Respiratory","Other"), theme=theme_classic()) +
  coord_cartesian(xlim=c(50,100)) +
  xlab("Age") +
  ylab("P(death)") +
  facet_wrap(~strata, ncol = 2)

ddsjoberg commented 2 months ago

Thank you @bethatkinson for bringing this issue up!

If a user specifies start.time, we'll use that. In most cases the implicit start time is 0. Do you have a suggestion for the situation where there are negative times?

library(survival)

survfit(Surv(time, status) ~ 1, lung) |> plot()

survfit(Surv(time - 500, status) ~ 1, lung) |> plot()

Created on 2024-04-03 with reprex v2.1.0

bethatkinson commented 2 months ago

Negative times are perfectly ok, especially when working on different time scales. Here is a note from Terry on the issue (someone else asked recently):

"Negative time has always been legal, in coxph or survfit. If someone wants to use "days since Jan 1 2024" as their time scale, and there are people who started in December, coxph doesn't care.

In fact, this very issue is why I decided, long ago, not to return "time 0" as part of the survival curve: if there are negative times, then I don't know when the true "start" is. For the data set you show (and I am completely making this up) perhaps everyone started at -7 for 'conditioning' and treatment starts on day 0 by definition. But then someone died on day -2: the PI would have thrown them out but their statistician said you can't due to intent to treat. For a survival curve post fit, you would need to add start.time = -7 to get the plot aligned the way you want.

I don't remember any of the details, but I think that a data set with negative times showed up sometime early in my career and forced me to think about it."

ddsjoberg commented 2 months ago

Thanks Beth! Agree negative times are 100% ok in some circumstances. Perhaps I'll implement like this:

  1. If start.time is specified, use it.
  2. Otherwise, start time will be assumed to be 0
  3. Perhaps throw a note, if negative times are present AND start.time is not specified, that the user should indicate the proper starting time.

I think that sounds reasonable. What do you think?
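The three steps above could be sketched roughly as follows. `resolve_start_time()` is a hypothetical helper written for illustration and is not part of ggsurvfit; plain lists stand in for survfit objects, on the assumption that only the `time` and `start.time` elements matter for this decision.

```r
# Hypothetical helper (not part of ggsurvfit): choose where the x-axis starts.
resolve_start_time <- function(fit) {
  # 1. A user-specified start.time wins.
  if (!is.null(fit$start.time)) return(fit$start.time)
  # 3. Negative times without a start.time: leave a note for the user.
  if (any(fit$time < 0)) {
    message("Negative times found and `start.time` not specified; ",
            "assuming a start time of 0.")
  }
  # 2. Otherwise assume the start time is 0.
  0
}

# Plain lists standing in for survfit objects.
with_start <- resolve_start_time(list(time = c(101, 105), start.time = 100))
no_start   <- resolve_start_time(list(time = c(5, 10)))
neg_times  <- resolve_start_time(list(time = c(-2, 3)))  # also emits the note
```

Using `message()` rather than `warning()` is a design choice in this sketch: a note stays visible without interrupting code that treats warnings as errors.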

bethatkinson commented 2 months ago

I could see that working. For reference, survfit0, which Terry uses to add in the initial time, chooses the start like this:

if (missing(start.time) || is.null(start.time)) {
  if (is.null(x$start.time)) start.time <- min(c(0, x$time))
  else start.time <- x$start.time
}
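To see what that default gives on the two `lung` fits from earlier in the thread, here is a small sketch. `default_start()` is a hypothetical standalone helper that only mirrors the inner branch of the snippet above; it is not a function in the survival package.

```r
library(survival)

# The default quoted above, written as a hypothetical standalone helper.
default_start <- function(x) {
  if (is.null(x$start.time)) min(c(0, x$time)) else x$start.time
}

fit_pos <- survfit(Surv(time, status) ~ 1, data = lung)
fit_neg <- survfit(Surv(time - 500, status) ~ 1, data = lung)

start_pos <- default_start(fit_pos)  # no negative times, so 0
start_neg <- default_start(fit_neg)  # earliest (negative) observed time
```

Unlike a fixed default of 0, this rule moves the start back to the earliest observed time whenever negative times are present.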
