Closed dereksonderegger closed 2 years ago
Dear @dereksonderegger , Thank you for the wonderfully detailed and thoughtful post. Below I've shown two somewhat simpler methods for filtering.
Overall I like the idea, but want to think on the API more before I would add it to the pkg however.
remove_row_type()
that can filter based on anything (like the version i added below)
modify_table_body()
?
Anyway, I"ll leave this open and continue to think on it. Realistically it could be some time before I actually think about it more deeply! FYI
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.6.1'
tbl <-
trial %>%
tbl_summary(by = trt)
tbl %>%
# only keep rows for age and grade
modify_table_body(
~.x %>% dplyr::filter(variable %in% c("age", "grade"))
) %>%
as_kable() # convert to kable to display in GH
Characteristic | Drug A, N = 98 | Drug B, N = 102 |
---|---|---|
Age | 46 (37, 59) | 48 (39, 56) |
Unknown | 7 | 4 |
Grade | ||
I | 35 (36%) | 33 (32%) |
II | 32 (33%) | 36 (35%) |
III | 31 (32%) | 33 (32%) |
# now export as a function
tbl_filter <- function(x, expr) {
x %>%
modify_table_body(
~.x %>% dplyr::filter({{ expr }})
)
}
tbl %>%
tbl_filter(variable %in% c("age", "grade")) %>%
as_kable() # convert to kable to display in GH
Characteristic | Drug A, N = 98 | Drug B, N = 102 |
---|---|---|
Age | 46 (37, 59) | 48 (39, 56) |
Unknown | 7 | 4 |
Grade | ||
I | 35 (36%) | 33 (32%) |
II | 32 (33%) | 36 (35%) |
III | 31 (32%) | 33 (32%) |
Created on 2022-07-13 by the reprex package (v2.0.1)
PS @dereksonderegger if you're in the regulatory space, stay tuned for a gtsummary-based pkg for adverse event reporting. It'll be presented at the R/Medicine conference in Aug (which is super inexpensive to join--between $10 and $50 depending on your role).
All of your concerns make sense and I really appreciate your thoughtfulness.
The approach of using modify_table_body()
is quite elegant/simple, but requires the user to know a bit about how gtsummary
stores the information internally and that it is the right function to use for something like that. Ultimately that can be addressed through documentation, but my expectation is that this is the type of action needed by folks that aren't necessarily experts with gtsummary()
and a user-friendly version might be nice.
Thanks for the heads up about the gtsummary
based package for adverse events. That is one aspect of the reports I have to generate and have been toying with infrastructure to automate that using gtsummary
. I'll wait to see what is already being built.
FYI @dereksonderegger i migrated this issue to bstfun, which serves somewhat as a sandbox for some gtsummary features.
@dereksonderegger what do you think of something like this? Removes all rows associated with the grade variable from the table. You can specify multiple variables and uses tidyselect
library(gtsummary)
trial %>%
tbl_summary(include = c(age, grade)) %>%
# remove all rows for grade
remove_row_type(variable = grade, type = "all") %>%
as_kable()
Characteristic | N = 200 |
---|---|
Age | 47 (38, 57) |
Unknown | 11 |
Created on 2022-07-19 by the reprex package (v2.0.1)
This is probably the cleanest way to do it and has minimal impact on the existing code base. Nice! Thanks for thinking about this!
In teams of data scientists, the creation of a table might belong to one group, but another is responsible for using the table in, for example, submission to regulatory agencies. Many times a gtsummary table might have extra variables included in the "official table" that gets validated, but aren't necessary to the current presentation or report. Instead, it would be great to be able to select only certain variables we want reporting on. Essentially, we want to hide some variables instead of rebuilding the table with a reduced variable set and possibly introducing errors and/or inconsistencies with the "official table".
Because this is hiding rows, I don't think footnotes or p-values or anything else should change.
It seems that
tbl_split()
is close to this, but is challenging to use if the variables to be kept are not adjacent to each other. As a first pass I wrote the following which seems to work. I could wrap this into a pull request if the approach I used (editing the table.body and meta.data ) is appropriate.