[Bug]: `surv_time` doesn't work with data that only has events

averissimo commented 5 months ago

What happened?

Found a problem when creating a table with survival data where columns are split by a variable and one (or more) groups has only events.

Reproducible examples (via `rtables`, or using only `tern`)

lyt <- rtables::basic_table() |>
  rtables::split_cols_by(var = "ARM") |>
  tern::surv_time(
    vars = "AVAL",
    is_event = "is_event"
  )

my_anl <- tibble::tribble(
  ~AVAL, ~ARM, ~is_event,
  1,  "B",     FALSE,
  2,  "B",     TRUE,
  1,  "A",     TRUE,
  2,  "A",     TRUE
)

rtables::build_table(lyt = lyt, df = my_anl)
#> Error: Error applying analysis function (var - AVAL): missing value where TRUE/FALSE needed
#>  occured at (row) path: root

^{Created on 2024-06-21 with reprex v2.1.0}

Some possible test to avoid this regression in the future :-)

testthat::test_that("a_surv_time works with events only", {
  anl <- tibble::tribble(
    ~AVAL, ~ARM, ~is_event,
    1,  "A",     TRUE,
    2,  "A",     TRUE
  )

  testthat::expect_silent(
    tern::a_surv_time(anl, .var = "AVAL", is_event = "is_event")
  )
})
#> ── Error: a_surv_time works with events only ───────────────────────────────────
#> Error in `if (x_stats[["range"]][1] == rng_censor_lwr && x_stats[["range"]][2] == 
#>     rng_censor_upr) {
#>     cell_fns[[.labels[["range"]]]] <- "Censored observations: range minimum & maximum"
#> } else if (x_stats[["range"]][1] == rng_censor_lwr) {
#>     cell_fns[[.labels[["range"]]]] <- "Censored observation: range minimum"
#> } else if (x_stats[["range"]][2] == rng_censor_upr) {
#>     cell_fns[[.labels[["range"]]]] <- "Censored observation: range maximum"
#> }`: missing value where TRUE/FALSE needed
#> Backtrace:
#>     ▆
#>  1. ├─testthat::expect_silent(...)
#>  2. │ └─testthat:::quasi_capture(enquo(object), NULL, evaluate_promise) at testthat/R/expect-silent.R:21:3
#>  3. │   ├─testthat (local) .capture(...) at testthat/R/quasi-label.R:54:3
#>  4. │   │ ├─withr::with_output_sink(...) at testthat/R/evaluate-promise.R:34:3
#>  5. │   │ │ └─base::force(code)
#>  6. │   │ ├─base::withCallingHandlers(...) at testthat/R/evaluate-promise.R:34:3
#>  7. │   │ └─base::withVisible(code) at testthat/R/evaluate-promise.R:34:3
#>  8. │   └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo)) at testthat/R/quasi-label.R:54:3
#>  9. └─tern::a_surv_time(anl, .var = "AVAL", is_event = "is_event") at rlang/R/eval.R:96:3
#> Error:
#> ! Test failed


testthat::test_that("a_surv_time works with censored data only", {
  anl <- tibble::tribble(
    ~AVAL, ~ARM, ~is_event,
    1,  "A",     FALSE,
    2,  "A",     FALSE
  )

  testthat::expect_silent(
    tern::a_surv_time(anl, .var = "AVAL", is_event = "is_event")
  )
})
#> Test passed 🎉

^{Created on 2024-06-21 with reprex v2.1.0}

Additional notes:

It works with a package manager snapshot from 2023-06-09 with these 3 versions:

formatters [0.5.0]
rtables [0.6.1]
tern [0.8.2]

sessionInfo()

R version 4.4.0 (2024-04-24)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_IE.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_IE.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_IE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_IE.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Berlin
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

loaded via a namespace (and not attached):
 [1] styler_1.10.3     utf8_1.2.4        generics_0.1.3    tidyr_1.3.1       renv_1.0.7        stringi_1.8.4    
 [7] lattice_0.22-6    digest_0.6.35     magrittr_2.0.3    evaluate_0.24.0   grid_4.4.0        fastmap_1.2.0    
[13] R.oo_1.26.0       R.cache_0.16.0    Matrix_1.7-0      processx_3.8.4    R.utils_2.12.3    backports_1.5.0  
[19] survival_3.7-0    ps_1.7.6          purrr_1.0.2       fansi_1.0.6       scales_1.3.0      Rdpack_2.6       
[25] cli_3.6.2         rlang_1.1.4       rbibutils_2.2.16  R.methodsS3_1.8.2 munsell_0.5.1     splines_4.4.0    
[31] reprex_2.1.0      yaml_2.3.8        withr_3.0.0       tools_4.4.0       checkmate_2.3.1   dplyr_1.1.4      
[37] colorspace_2.1-0  ggplot2_3.5.1     formatters_0.5.8  broom_1.0.6       rtables_0.6.8     vctrs_0.6.5      
[43] R6_2.5.1          lifecycle_1.0.4   fs_1.6.4          callr_3.7.6       pkgconfig_2.0.3   clipr_0.8.0      
[49] pillar_1.9.0      gtable_0.3.5      glue_1.7.0        xfun_0.44         tibble_3.2.1      tidyselect_1.2.1 
[55] rstudioapi_0.16.0 knitr_1.47        htmltools_0.5.8.1 rmarkdown_2.27    compiler_4.4.0    tern_0.9.5

Relevant log output

No response

Code of Conduct

[X] I agree to follow this project's Code of Conduct.

Contribution Guidelines

[X] I agree to follow this project's Contribution Guidelines.

Security Policy

[X] I agree to follow this project's Security Policy.

Melkiades commented 5 months ago

The following works. I think it is a matter of having at least one is_event per split. Probably need to improve messaging though

lyt <- rtables::basic_table() |>
  rtables::split_cols_by(var = "ARM") |>
  tern::surv_time(
    vars = "AVAL",
    is_event = "is_event"
  )

my_anl <- tibble::tribble(
  ~AVAL, ~ARM, ~is_event,
  1,  "B",     FALSE,
  2,  "B",     TRUE,
  1,  "A",     FALSE,
  2,  "A",     TRUE
)

rtables::build_table(lyt = lyt, df = my_anl)

averissimo commented 5 months ago

Yup, That is the case of a real study data we are working on.

I think the solution show still generates a valid table (even if a split only has events).

So either have the same result as previous versions of tern, or add relevant annotations on the table.

insightsengineering / tern