Closed ddsjoberg closed 1 month ago
I think the issue here is that when by
is not specified, the function performs location matching of the argument, and therefore, thinks by = ard_categorical(variables = "AGEGR1")
.
If we want the by
argument to be optional, we should probably move it to after the ...
? The other alternative would be to keep it where it is, but we can't have a default value, e.g. ard_stack(data, by, ..., .attributes = FALSE, .missing = FALSE, .shuffle = FALSE)
, which would require the user to input something.
@bzkrouse what do you think? Do you have a preference for one of these options? I assume tfrmt users could also want this functionality to shuffle easily?
library(cards)
packageVersion("cards")
#> [1] '0.1.0.9023'
# everything does work if we specify `by=NULL`
ard_stack(
data = ADSL,
by = NULL,
ard_categorical(variables = "AGEGR1"),
ard_continuous(variables = "AGE")
)
#> {cards} data frame: 17 x 9
#> variable variable_level context stat_name stat_label stat
#> 1 AGEGR1 <65 categori… n n 33
#> 2 AGEGR1 <65 categori… N N 254
#> 3 AGEGR1 <65 categori… p % 0.13
#> 4 AGEGR1 >80 categori… n n 77
#> 5 AGEGR1 >80 categori… N N 254
#> 6 AGEGR1 >80 categori… p % 0.303
#> 7 AGEGR1 65-80 categori… n n 144
#> 8 AGEGR1 65-80 categori… N N 254
#> 9 AGEGR1 65-80 categori… p % 0.567
#> 10 AGE continuo… N N 254
#> 11 AGE continuo… mean Mean 75.087
#> 12 AGE continuo… sd SD 8.246
#> 13 AGE continuo… median Median 77
#> 14 AGE continuo… p25 25th Per… 70
#> 15 AGE continuo… p75 75th Per… 81
#> 16 AGE continuo… min Min 51
#> 17 AGE continuo… max Max 89
#> ℹ 3 more variables: fmt_fn, warning, error
Created on 2024-05-08 with reprex v2.1.0
OPTION 1: Move the argument
ard_stack(data, ..., .attributes = FALSE, .by = NULL, .missing = FALSE, .shuffle = FALSE)
the reason I like this option is because we could easily extend ard_stack()
with additional arguments, and it wouuld be clear that they all appear at the end. Otherwise, if we kept the position where it is now, and wanted to add new arguments it could be very disruptive.
We could also make the default .by = dplyr::group_vars(data)
like the ard summary functions, so if you do use grouped data frames, you wouldn't need to supply a value.
OPTION 2: Remove the argument default
ard_stack(data, by, ..., .attributes = FALSE, .missing = FALSE, .shuffle = FALSE)
I like that it feels like the natural place for the by argument. But with that said, users can still place the argument there with ard_stack(ADSL, .by = ARM, ...)
I guess I am leaning toward option 1
In support of option #1 and the proposed default of group_vars(data)
would be convenient!!
It's useful to use
ard_stack()
without aby
argument for the.attributes
and.missing
arguments.Created on 2024-05-08 with reprex v2.1.0