kassambara / survminer

Survival Analysis and Visualization
https://rpkgs.datanovia.com/survminer/
509 stars 164 forks source link

Error with `ggadjustedcurves()` : grouping factor has to be included in the coxph model #623

Open AlexisZr opened 1 year ago

AlexisZr commented 1 year ago

Expected behavior

As explicitly noted in the survminer cheatsheet:

is not necessary to include the grouping factor in the Cox model

Actual behavior

Despite following the exemplified information presented in the cheatsheet, the ggadjustedcurves() function exclusively generates a solitary survival curve without taking into account the provided grouping factor (which is not included in the cox model). To ascertain the veracity of the issue, I replicated the aforementioned example from the cheatsheet, which once more yielded solely one survival curve as opposed to the illustrated four.

Steps to reproduce the problem


lung$sex <- ifelse(lung$sex == 1, 
"Male", "Female")
fit <- coxph(Surv(time, status) ~ 
sex + ph.ecog + age +
strata(sex), 
data = lung)

lung$age3 <- cut(lung$age, c(35,55,65,85)) ggadjustedcurves(fit, data=lung, variable="age3")

markhwhiteii commented 1 year ago

+1 to this. The cheat sheet—and the README—both mention this is actually supposed to be done with the ggcoxadjustedcurves function, but that function is nowhere in the repository and thus not exported when downloading either from CRAN or GitHub.

EDIT: Ah, searching the NEWS, I see that it was deprecated, per #229. For what it's worth, when I run it on my data (unfortunately, I can't share the dataset), I get an error:

Error in if (xi > xj) 1L else -1L : missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In xtfrm.data.frame(x) : cannot xtfrm data frames
2: In Ops.factor(xi, xj) : ‘>’ not meaningful for factors
markhwhiteii commented 1 year ago

@AlexisZr I found a workaround where you use the predict method:

# fit model
fit <- coxph(Surv(time, status) ~ sex + ph.ecog + age, data = lung)

# make prediction data
ages <- c(35, 55, 65, 85)
times <- min(lung$time):max(lung$time)
fit_pred <- data.frame(
  age = rep(ages, each = length(times)), 
  time = rep(times, length(ages)),
  sex = "Female", # holding this
  ph.ecog = 1, # and this constant
  status = 0 # doesnt matter if 1 or 0, you get the same prediction
)

# predict survival
fit_pred$pred <- predict(fit, fit_pred, type = "survival")

# plot it
ggplot(fit_pred, aes(x = time, y = pred)) +
  geom_line(aes(color = factor(age), group = factor(age)))
AlexisZr commented 1 year ago

@markhwhiteii Deeply appreciated the workaround. Nonetheless, I manage to tackle the problem a while ago using the adjustedCurves Package (I highly recommend this package, as it offers a more straightforward approach )