ddsjoberg / gtsummary

Presentation-Ready Data Summary and Analytic Result Tables
http://www.danieldsjoberg.com/gtsummary

`modify_header()` interacts with `as_hux_table()` yielding sample sizes with scientific notation #1462

Closed shannonpileggi closed 1 year ago

shannonpileggi commented 1 year ago

When modify_header() is applied to a tbl_summary() table and the result is converted with a huxtable function such as as_hux_table() or as_hux_xlsx(), larger sample sizes in the header are rendered in scientific notation.

A hacky fix for this is to use huxtable::set_number_format(NA).

library(tidyverse, warn.conflicts = FALSE)
#> Warning: package 'dplyr' was built under R version 4.2.2
library(gtsummary)
#> Warning: package 'gtsummary' was built under R version 4.2.2
library(huxtable)
#> 
#> Attaching package: 'huxtable'
#> The following object is masked from 'package:gtsummary':
#> 
#>     as_flextable
#> The following object is masked from 'package:dplyr':
#> 
#>     add_rownames
#> The following object is masked from 'package:ggplot2':
#> 
#>     theme_grey

n <- 2642

df_ex <- tibble(
  var1 = rep(1, n),
  var2 = c(rep("A", n/2), rep("B", n/2))
)

t_ex <- df_ex |> 
  tbl_summary(by = var2) |> 
  modify_header(all_stat_cols() ~ "**{level}**  \nN = {n}")

h_ex <- as_hux_table(t_ex)
h_ex
#> Warning in knit_print.huxtable(x, ...): Unrecognized output format "gfm-yaml". Using `to_screen` to print huxtables.
#> Set options("huxtable.knitr_output_format") manually to "latex", "html", "rtf", "docx", "pptx", "md" or "screen".
             Characteristic   AN = 1.32e+03   BN = 1.32e+03  
           ──────────────────────────────────────────────────
             var1             1,321 (100%)    1,321 (100%)   
           ──────────────────────────────────────────────────
             n (%)                                           

Column names: label, stat_1, stat_2


h_ex_fixed <- h_ex |>
  huxtable::set_number_format(NA)

h_ex_fixed
#> Warning in knit_print.huxtable(x, ...): Unrecognized output format "gfm-yaml". Using `to_screen` to print huxtables.
#> Set options("huxtable.knitr_output_format") manually to "latex", "html", "rtf", "docx", "pptx", "md" or "screen".
              Characteristic    AN = 1321      BN = 1321    
            ────────────────────────────────────────────────
              var1             1,321 (100%)   1,321 (100%)  
            ────────────────────────────────────────────────
              n (%)                                         

Column names: label, stat_1, stat_2

Created on 2023-02-17 with reprex v2.0.2
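The `1.32e+03` in the header is what a `%.3g` format produces for 1321. The sketch below uses only base R and assumes that `"%.3g"` is huxtable's default `number_format` (as its documentation describes) and that huxtable applies it to the bare number in the header text; it is not taken from the huxtable source. Under that assumption, setting the format to `NA` simply skips this step, which would explain why the workaround above restores `1321`.

```r
# Assumption: huxtable's default number_format is "%.3g".
fmt <- "%.3g"

# %.3g keeps three significant digits and switches to scientific
# notation once the decimal exponent reaches the precision (3).
formatted <- sprintf(fmt, c(999, 1000, 1321))
print(formatted)
```

Counts below 1000 are left alone, which matches the report that only larger sample sizes are affected.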

ddsjoberg commented 1 year ago

Hmmm, trying to think of a reasonable solution here... there are surely some users who rely on the huxtable default behavior. You can avoid huxtable's default number formatting by formatting the N yourself in the header, so that it reaches huxtable as already-styled text.

modify_header(all_stat_cols() ~ "**{level}**  \nN = {style_number(n)}")

Alternatively, we can update the PCCTC gtsummary theme so that the default header has a line break between the level and the N, and styles the N. Perhaps that is the best way around this at this point? What do you think, @shannonpileggi?

library(gtsummary)

n <- 2642

df_ex <- dplyr::tibble(
  var1 = rep(1, n),
  var2 = c(rep("A", n/2), rep("B", n/2))
)

df_ex |> 
  tbl_summary(by = var2) |> 
  modify_header(all_stat_cols() ~ "**{level}**  \nN = {style_number(n)}") |> 
  as_hux_table()
#> Warning in knit_print.huxtable(x, ...): Unrecognized output format "gfm-yaml". Using `to_screen` to print huxtables.
#> Set options("huxtable.knitr_output_format") manually to "latex", "html", "rtf", "docx", "pptx", "md" or "screen".
              Characteristic    AN = 1,321     BN = 1,321   
            ────────────────────────────────────────────────
              var1             1,321 (100%)   1,321 (100%)  
            ────────────────────────────────────────────────
              n (%)                                         

Column names: label, stat_1, stat_2

Created on 2023-02-17 with reprex v2.0.2
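To see what the header looks like once the count is formatted up front, here is a minimal base R sketch. `format(n, big.mark = ",")` is used only as a stand-in for `style_number(n)`, and the names `n_label` and `header` are hypothetical; gtsummary builds the real header internally from the `modify_header()` pattern.

```r
n <- 1321

# Stand-in for style_number(n): add a thousands separator and
# return a character string rather than a number.
n_label <- format(n, big.mark = ",")

# Header text in the same shape as the modify_header() pattern above,
# with level "A" filled in by hand for illustration.
header <- paste0("**A**  \nN = ", n_label)
cat(header, "\n")
```

Because the label is now `1,321` rather than a bare four-digit number, it comes through `as_hux_table()` unchanged in the output above.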

shannonpileggi commented 1 year ago

I agree, let's move the solution to croquet.