kassambara / survminer

Survival Analysis and Visualization
https://rpkgs.datanovia.com/survminer/

ggsurvplot risk.table error in labels #464

Open billudada78 opened 4 years ago

billudada78 commented 4 years ago

Expected behavior

I expected a variable stored as a factor to keep each factor level linked to its row in the risk table. Instead, the rows of the risk table were labelled with the wrong factor levels (e.g. "Level A" listed beside the row holding Level B data), which could cause serious problems for anyone who does not spot the mismatch. In the code below, km_fit1 and km_fit2 should produce similar plots, differing at most in the sort order of the groups:

measurements_surv_model$initial_group_c <- as.character(measurements_surv_model$initial_group)

km_fit1 <- survfit(Surv(measurements_surv_model$duration, measurements_surv_model$censor)~measurements_surv_model$initial_group)

km_fit2 <- survfit(Surv(measurements_surv_model$duration, measurements_surv_model$censor)~measurements_surv_model$initial_group_c)

ggsurvplot(km_fit1, data = measurements_surv_model, risk.table = TRUE, break.time.by = 5, linetype = "strata")

ggsurvplot(km_fit2, data = measurements_surv_model, risk.table = TRUE, break.time.by = 5, linetype = "strata")
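For anyone trying to reproduce this, one thing that may be worth ruling out is the use of `measurements_surv_model$...` inside the formula. This is an assumption, not a confirmed diagnosis. The sketch below fits the same two models with bare column names and a `data =` argument. The data frame is a hypothetical stand-in with the same column names, since the original data are not shown, and the plotting step only runs if survminer is installed.

```r
library(survival)

# Hypothetical stand-in for measurements_surv_model (the original data are not shown).
# "UM" gets a single observation, as described in the report.
set.seed(42)
measurements_surv_model <- data.frame(
  duration = round(rexp(60, rate = 0.1)) + 1,
  censor = rbinom(60, 1, 0.7),
  initial_group = factor(
    rep(c("zorgmedewerkers", "UM", "other"), times = c(30, 1, 29)),
    levels = c("zorgmedewerkers", "UM", "other")
  )
)
measurements_surv_model$initial_group_c <-
  as.character(measurements_surv_model$initial_group)

# Bare column names plus data = ..., instead of measurements_surv_model$... in the formula
km_fit1 <- survfit(Surv(duration, censor) ~ initial_group,
                   data = measurements_surv_model)
km_fit2 <- survfit(Surv(duration, censor) ~ initial_group_c,
                   data = measurements_surv_model)

# Same plotting calls as in the report
if (requireNamespace("survminer", quietly = TRUE)) {
  p1 <- survminer::ggsurvplot(km_fit1, data = measurements_surv_model,
                              risk.table = TRUE, break.time.by = 5,
                              linetype = "strata")
  p2 <- survminer::ggsurvplot(km_fit2, data = measurements_surv_model,
                              risk.table = TRUE, break.time.by = 5,
                              linetype = "strata")
  print(p1)
  print(p2)
}
```

If the risk table labels line up in this form but not in the `$` form, that would narrow the problem down to how the formula is parsed.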

Actual behavior

As the plots below show, the curves themselves are fine and consistent, but the labels in the risk table are "randomly" reordered when the factor variable, initial_group, is used instead of its character equivalent, initial_group_c. The group "UM", for example, has only one observation, yet in the first plot it is wrongly listed alongside "zorgmedewerkers". The other groups are similarly misplaced, in no apparent order.
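A quick way to check whether the mismatch arises in the fit or only in the plotting step is to compare the strata stored in the survfit object with counts taken directly from the data. The sketch below uses the `lung` data from the survival package as a stand-in. The `sex_f` column and the helper variable names are assumptions made for illustration.

```r
library(survival)

# Example data from the survival package; sex is recoded as a factor whose
# level order ("Male", "Female") is deliberately not alphabetical.
d <- lung
d$sex_f <- factor(d$sex, levels = c(1, 2), labels = c("Male", "Female"))

fit <- survfit(Surv(time, status) ~ sex_f, data = d)

# Number of subjects per stratum as stored in the fit, named by stratum label
fit_counts <- setNames(as.vector(fit$n), sub("^sex_f=", "", names(fit$strata)))

# Counts taken directly from the data, in factor-level order
data_counts <- table(d$sex_f)

print(fit_counts)
print(data_counts)

# TRUE when the fit itself keeps labels and counts aligned
labels_match <- identical(names(fit_counts), names(data_counts)) &&
  all(fit_counts == as.vector(data_counts))
print(labels_match)
```

With right-censored data like this, the first column of the risk table (time 0) should show the same counts beside each label. If the fit is aligned but the table is not, the relabelling happens on the plotting side.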

Using factor version of the variable (km_fit1):

image

Using character version of the variable (km_fit2):

image

session_info()

- Session info ---------------------------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.6.3 (2020-02-29)
 os       Windows 10 x64              
 system   x86_64, mingw32             
 ui       RStudio                     
 language (EN)                        
 collate  English_United States.1252  
 ctype    English_United States.1252  
 tz       Europe/Berlin               
 date     2020-04-20                  

- Packages -------------------------------------------------------------------------------------------------------------------
 package     * version date       lib source        
 assertthat    0.2.1   2019-03-21 [1] CRAN (R 3.6.1)
 backports     1.1.5   2019-10-02 [1] CRAN (R 3.6.1)
 broom         0.5.5   2020-02-29 [1] CRAN (R 3.6.3)
 callr         3.4.2   2020-02-12 [1] CRAN (R 3.6.3)
 cli           2.0.2   2020-02-28 [1] CRAN (R 3.6.3)
 colorspace    1.4-1   2019-03-18 [1] CRAN (R 3.6.1)
 crayon        1.3.4   2017-09-16 [1] CRAN (R 3.6.1)
 data.table    1.12.8  2019-12-09 [1] CRAN (R 3.6.3)
 desc          1.2.0   2018-05-01 [1] CRAN (R 3.6.1)
 devtools      2.2.2   2020-02-17 [1] CRAN (R 3.6.3)
 digest        0.6.25  2020-02-23 [1] CRAN (R 3.6.3)
 dplyr       * 0.8.3   2019-07-04 [1] CRAN (R 3.6.1)
 ellipsis      0.3.0   2019-09-20 [1] CRAN (R 3.6.1)
 fansi         0.4.1   2020-01-08 [1] CRAN (R 3.6.3)
 fs            1.3.1   2019-05-06 [1] CRAN (R 3.6.3)
 generics      0.0.2   2018-11-29 [1] CRAN (R 3.6.2)
 ggplot2     * 3.3.0   2020-03-05 [1] CRAN (R 3.6.3)
 ggpubr      * 0.2.5   2020-02-13 [1] CRAN (R 3.6.3)
 ggsignif      0.6.0   2019-08-08 [1] CRAN (R 3.6.2)
 glue          1.3.1   2019-03-12 [1] CRAN (R 3.6.1)
 gridExtra     2.3     2017-09-09 [1] CRAN (R 3.6.1)
 gtable        0.3.0   2019-03-25 [1] CRAN (R 3.6.1)
 highr         0.8     2019-03-20 [1] CRAN (R 3.6.2)
 hms           0.5.3   2020-01-08 [1] CRAN (R 3.6.3)
 km.ci         0.5-2   2009-08-30 [1] CRAN (R 3.6.2)
 KMsurv        0.1-5   2012-12-03 [1] CRAN (R 3.6.0)
 knitr         1.28    2020-02-06 [1] CRAN (R 3.6.3)
 labeling      0.3     2014-08-23 [1] CRAN (R 3.6.0)
 lattice       0.20-40 2020-02-19 [2] CRAN (R 3.6.3)
 lifecycle     0.2.0   2020-03-06 [1] CRAN (R 3.6.3)
 magrittr    * 1.5     2014-11-22 [1] CRAN (R 3.6.1)
 Matrix        1.2-18  2019-11-27 [2] CRAN (R 3.6.3)
 memoise       1.1.0   2017-04-21 [1] CRAN (R 3.6.1)
 munsell       0.5.0   2018-06-12 [1] CRAN (R 3.6.1)
 nlme          3.1-144 2020-02-06 [2] CRAN (R 3.6.3)
 pillar        1.4.3   2019-12-20 [1] CRAN (R 3.6.3)
 pkgbuild      1.0.6   2019-10-09 [1] CRAN (R 3.6.1)
 pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 3.6.1)
 pkgload       1.0.2   2018-10-29 [1] CRAN (R 3.6.1)
 prettyunits   1.1.1   2020-01-24 [1] CRAN (R 3.6.3)
 processx      3.4.2   2020-02-09 [1] CRAN (R 3.6.3)
 ps            1.3.2   2020-02-13 [1] CRAN (R 3.6.3)
 purrr         0.3.3   2019-10-18 [1] CRAN (R 3.6.3)
 qwraps2     * 0.4.2   2019-12-02 [1] CRAN (R 3.6.2)
 R6            2.4.1   2019-11-12 [1] CRAN (R 3.6.3)
 Rcpp          1.0.2   2019-07-25 [1] CRAN (R 3.6.1)
 readr       * 1.3.1   2018-12-21 [1] CRAN (R 3.6.2)
 remotes       2.1.1   2020-02-15 [1] CRAN (R 3.6.3)
 rlang         0.4.5   2020-03-01 [1] CRAN (R 3.6.3)
 rprojroot     1.3-2   2018-01-03 [1] CRAN (R 3.6.1)
 rstudioapi    0.11    2020-02-07 [1] CRAN (R 3.6.3)
 scales        1.0.0   2018-08-09 [1] CRAN (R 3.6.1)
 sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 3.6.1)
 survival    * 3.1-8   2019-12-03 [2] CRAN (R 3.6.3)
 survminer   * 0.4.6   2019-09-03 [1] CRAN (R 3.6.2)
 survMisc      0.5.5   2018-07-05 [1] CRAN (R 3.6.2)
 testthat      2.3.2   2020-03-02 [1] CRAN (R 3.6.3)
 tibble        2.1.3   2019-06-06 [1] CRAN (R 3.6.1)
 tidyr         1.0.2   2020-01-24 [1] CRAN (R 3.6.3)
 tidyselect    0.2.5   2018-10-11 [1] CRAN (R 3.6.1)
 usethis       1.5.1   2019-07-04 [1] CRAN (R 3.6.1)
 utf8          1.1.4   2018-05-24 [1] CRAN (R 3.6.1)
 vctrs         0.2.3   2020-02-20 [1] CRAN (R 3.6.3)
 withr         2.1.2   2018-03-15 [1] CRAN (R 3.6.1)
 xfun          0.12    2020-01-13 [1] CRAN (R 3.6.3)
 xtable        1.8-4   2019-04-21 [1] CRAN (R 3.6.1)
 zoo           1.8-6   2019-05-28 [1] CRAN (R 3.6.1)

[1] C:/Users/xxxxxxx/Documents/R/win-library/3.6
[2] C:/Users/xxxxxxx/Documents/R/R-3.6.3/library

JLGlass commented 4 years ago

I have the same issue with a categorical variable. I get the same error regardless of whether it is a character column or factor. However, if I take out the risk table I don't get the error.

For example:

km.fit = survfit(Surv(os.months, os.status) ~ mutation.class, data=dt.survival)
ggsurvplot(km.fit, risk.table=T)

Warning message: Vectorized input to element_text() is not officially supported. Results may be unexpected or may change in future versions of ggplot2.

But no error with this:

km.fit = survfit(Surv(os.months, os.status) ~ mutation.class, data=dt.survival)
ggsurvplot(km.fit, risk.table=F)

I suspect it has something to do with the survminer code handling the labels for the risk table.

If it helps, I'd suggest following the ggplot2 paradigm of allowing named color vectors.
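Whether ggsurvplot accepts named vectors directly is not established in this thread. As a possible interim sketch, you can key your own colours and labels by stratum name and reorder them to match the fit before passing them to the `palette` and `legend.labs` arguments. The `lung` data, the colours and the variable names here are assumptions chosen for illustration.

```r
library(survival)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Strata names in the order the fit stores them
strata_names <- names(fit$strata)

# Colours and labels keyed by stratum name; the order written here does not matter.
# In lung, sex is coded 1 = male, 2 = female.
my_colors <- c("sex=2" = "firebrick", "sex=1" = "steelblue")
my_labels <- c("sex=2" = "Female", "sex=1" = "Male")

# Reorder both to follow the strata order of the fit
palette_ordered <- unname(my_colors[strata_names])
labels_ordered <- unname(my_labels[strata_names])

if (requireNamespace("survminer", quietly = TRUE)) {
  p <- survminer::ggsurvplot(fit, data = lung, risk.table = TRUE,
                             palette = palette_ordered,
                             legend.labs = labels_ordered)
  print(p)
}
```

Looking up by name rather than by position means the label and colour for a group stay together even if the strata order changes.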

chrisleitzinger commented 3 years ago

Hi, I have the same problem when leaving the label, and it is worse when removing the label and/or using linetype.

image