atorus-research / Tplyr

https://atorus-research.github.io/Tplyr/

Insert row breaks between layers without removing duplicated values #183

Open johanneswerner opened 5 months ago

johanneswerner commented 5 months ago

I am aware that I can use apply_row_masks() to add row breaks between layers; however, this method also blanks out duplicated values within a column. How can I add row breaks without blanking out values that repeat in consecutive rows? Thank you!

mstackhouse commented 5 months ago

Can you provide a reprex? There are some issues that I know need to be fixed within this function, but distinct() is only run when creating the breaks and not the data itself. So an example of what you're encountering would help investigation.

johanneswerner commented 5 months ago

You are right, my apologies. Please see the example below:

library(Tplyr)

df <- data.frame(
  "Subject" = c("A001", "A002", "B001", "B002", "B003"),
  "Cohort" = c("A", "A", "B", "B", "B"),
  "AE_related" = c("No", "No", "No", "No", "No"),
  "DLT" = c("No", "No", "No", "No", "No"),
  "SAE" = c("Yes", "No", "No", "No", "No")
)

tbl <- tplyr_table(df, Cohort) %>%
  add_layer(
    group_count(AE_related, by = "AE_related") %>%
      set_distinct_by(Subject) %>%
      set_format_strings(f_str("xxx (xx.x%); xxx", distinct_n, distinct_pct, n))
  ) %>%
  add_layer(
    group_count(DLT, by = "DLT") %>%
      set_distinct_by(Subject) %>%
      set_format_strings(f_str("xxx (xx.x%); xxx", distinct_n, distinct_pct, n))
  ) %>%
  add_layer(
    group_count(SAE, by = "SAE") %>%
      set_distinct_by(Subject) %>%
      set_format_strings(f_str("xxx (xx.x%); xxx", distinct_n, distinct_pct, n))
  ) %>%
  add_total_group()

When I build the table, I get the following output:

tbl %>%
  build()

# A tibble: 4 × 8
  row_label1 row_label2 var1_A              var1_B              var1_Total          ord_layer_index ord_layer_1 ord_layer_2
  <chr>      <chr>      <chr>               <chr>               <chr>                         <int>       <int>       <dbl>
1 AE_related No         "  2 (100.0%);   2" "  3 (100.0%);   3" "  5 (100.0%);   5"               1           1           1
2 DLT        No         "  2 (100.0%);   2" "  3 (100.0%);   3" "  5 (100.0%);   5"               2           1           1
3 SAE        No         "  1 (50.0%);   1"  "  3 (100.0%);   3" "  4 (80.0%);   4"                3           1           1
4 SAE        Yes        "  1 (50.0%);   1"  "  0 ( 0.0%);   0"  "  1 (20.0%);   1"                3           1           2

If I would like to introduce empty rows between the different layers, I can use apply_row_masks().

tbl %>%
  build() %>%
  apply_row_masks(row_breaks = TRUE)

# A tibble: 7 × 9
  row_label1   row_label2 var1_A              var1_B              var1_Total     ord_layer_index ord_layer_1 ord_layer_2 ord_break
  <chr>        <chr>      <chr>               <chr>               <chr>                    <int>       <int>       <dbl>     <dbl>
1 "AE_related" "No"       "  2 (100.0%);   2" "  3 (100.0%);   3" "  5 (100.0%)…               1           1           1         1
2 ""           ""         ""                  ""                  ""                           1          NA          NA         2
3 "DLT"        ""         "  2 (100.0%);   2" "  3 (100.0%);   3" "  5 (100.0%)…               2           1           1         1
4 ""           ""         ""                  ""                  ""                           2          NA          NA         2
5 "SAE"        ""         "  1 (50.0%);   1"  "  3 (100.0%);   3" "  4 (80.0%);…               3           1           1         1
6 ""           "Yes"      "  1 (50.0%);   1"  "  0 ( 0.0%);   0"  "  1 (20.0%);…               3           1           2         1
7 ""           ""         ""                  ""                  ""                           3          NA          NA         2

However, the repeating value "No" in row_label2 now gets blanked out across different layers. How can I suppress this behavior?

I would like to implement row breaks between layers without blanking out repeating values (or only blanking out inside of the same layer).

Thank you very much!

I am running Tplyr v. 1.2.1
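
One possible workaround, sketched here in base R, is to skip apply_row_masks() and post-process the built table directly. The two helpers below, mask_within_layer() and add_layer_breaks(), are hypothetical names and not part of Tplyr. They assume only that the built table has an ord_layer_index column and row_label* columns, as in the output above. The demo uses a hand-typed data frame that mirrors part of that output, so it runs without Tplyr installed.

```r
# Hypothetical helpers, not part of Tplyr. They assume the built table has an
# `ord_layer_index` column and `row_label*` columns, as in the output above.

# Blank out a row label only when it repeats the previous row of the SAME layer.
mask_within_layer <- function(dat, layer_col = "ord_layer_index") {
  layer <- dat[[layer_col]]
  same_layer <- c(FALSE, layer[-1] == layer[-length(layer)])
  for (nm in grep("^row_label", names(dat), value = TRUE)) {
    x <- dat[[nm]]
    repeated <- c(FALSE, x[-1] == x[-length(x)])
    # which() drops NA comparisons, so NA labels are left untouched
    dat[[nm]][which(same_layer & repeated)] <- ""
  }
  dat
}

# Append one blank row after each layer without touching existing values.
add_layer_breaks <- function(dat, layer_col = "ord_layer_index") {
  pieces <- split(dat, dat[[layer_col]])  # one piece per layer, in layer order
  with_breaks <- lapply(pieces, function(piece) {
    # Start from a one-row copy so the blank row has the same columns
    blank <- piece[1, , drop = FALSE]
    for (nm in names(blank)) {
      if (nm == layer_col) next           # keep the layer index on the break row
      blank[[nm]] <- if (is.character(blank[[nm]])) "" else NA
    }
    rbind(piece, blank)
  })
  out <- do.call(rbind, with_breaks)
  rownames(out) <- NULL
  out
}

# Hand-typed stand-in for part of the `tbl %>% build()` output shown above
built <- data.frame(
  row_label1 = c("AE_related", "DLT", "SAE", "SAE"),
  row_label2 = c("No", "No", "No", "Yes"),
  var1_A = c("  2 (100.0%);   2", "  2 (100.0%);   2",
             "  1 (50.0%);   1", "  1 (50.0%);   1"),
  ord_layer_index = c(1L, 2L, 3L, 3L),
  ord_layer_1 = c(1L, 1L, 1L, 1L),
  ord_layer_2 = c(1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Optional in-layer masking first, then breaks between layers
res <- add_layer_breaks(mask_within_layer(built))
print(res[, c("row_label1", "row_label2", "ord_layer_index")])
```

With the real table, the same idea would presumably be tbl %>% build() %>% mask_within_layer() %>% add_layer_breaks(), though I have only reasoned about this on plain data frames, not on the tibble that build() returns. Dropping the mask_within_layer() step keeps every label as built and only adds the break rows.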