Closed. ChrisHIV closed this issue 1 month ago.
I think you really just need a basic if statement. You aren't doing anything vectorized, so you don't need case_when().
library(dplyr)
do_it <- function(y) {
  if (length(unique(y)) == 1L) {
    paste("This x has one y:", unique(y))
  } else {
    "This x has several ys"
  }
}
tibble(x = c("A", "A", "B", "B"),
       y = c("I", "I", "J", "K")) %>%
  summarise(.by = x,
            summary = do_it(y))
#> # A tibble: 2 × 2
#>   x     summary
#>   <chr> <chr>
#> 1 A     This x has one y: I
#> 2 B     This x has several ys
A simpler example of what you are trying to demonstrate is:
dplyr::case_when(
  FALSE ~ c(1, 2),
  TRUE ~ 3
)
#> [1] 3 3
I actually think this should be an error. See https://github.com/tidyverse/dplyr/issues/7082#issuecomment-2334173589 where I talk about this in more detail. The RHSs of case_when() should either have size 1 or size size, where size comes from the size of the things on the LHS. The underlying engine already throws an error here:
dplyr:::vec_case_when(
  conditions = list(FALSE, TRUE),
  values = list(c(1, 2), 3)
)
#> Error in `dplyr:::vec_case_when()`:
#> ! `values[[1]]` must have size 1, not size 2.
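For contrast, here is a minimal sketch of a call that does follow the sizing rule described above. The vector x and the labels are made up for illustration, and this assumes a reasonably recent dplyr. Every LHS has size 3, so each RHS is either size 1 (recycled) or size 3 (used element-wise):

```r
library(dplyr)

# Hypothetical input: all conditions below have size 3.
x <- c(1, 5, 10)

result <- case_when(
  x < 3 ~ "low",             # size 1, recycled to size 3
  x < 8 ~ c("a", "b", "c"),  # size 3, matches the LHS size
  TRUE ~ "high"              # size 1 fallback
)
result
#> [1] "low"  "b"    "high"
```

Only the second element takes its value from the size 3 RHS, because that is the only position where x < 8 is the first condition to be true.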
Anyway, for your use case of having 2 conditions that basically amount to TRUE and FALSE, I think you are much better served by an if statement.
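As a small base R sketch of why if fits here: if evaluates a single condition and returns the chosen branch as-is, so each branch can have its own length and nothing is recycled. The helper name pick is hypothetical, used only for illustration:

```r
# Hypothetical helper: the two branches deliberately differ in length.
pick <- function(flag) {
  if (flag) {
    c(1, 2)  # returned unchanged, length 2
  } else {
    3        # returned unchanged, length 1
  }
}

pick(TRUE)
#> [1] 1 2
pick(FALSE)
#> [1] 3
```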
case_when() appears to first decide on the multiplicity of values it will return based on considering all conditions and then coerce the value specified by the satisfied condition into this multiplicity, rather than returning multiplicity of values specified by the satisfied condition. Is this a bug or intentional? It leads to unexpected behaviour when trying to use conditions related to multiplicity of values.
Example
output:
together with a warning message about returning more (or less) than 1 row per summarise() group.
output I expected