Open tmastny opened 4 years ago
This should work, but I can't immediately understand why it doesn't:
library(dplyr, warn.conflicts = FALSE)
df <- tibble(
id = 1:5,
w = c(10, NA, NA, NA, 14),
x = c(NA, 21, 22, 23, NA),
y = c(NA, NA, 32, 33, NA),
z = c(NA, NA, NA, 43, 44)
)
df %>%
mutate(a = coalesce(!!!across(-id)))
#> Error in .subset2(chunks, self$get_current_group()): attempt to select less than one element in integerOneIndex
Created on 2020-04-14 by the reprex package (v0.3.0)
Splicing happens "too early" here, but wrapping the call in a function works:
library(dplyr, warn.conflicts = FALSE)
df <- tibble(
id = 1:5,
w = c(10, NA, NA, NA, 14),
x = c(NA, 21, 22, 23, NA),
y = c(NA, NA, 32, 33, NA),
z = c(NA, NA, NA, 43, 44)
)
coacross <- function(...) {
coalesce(!!!across(...))
}
df %>%
mutate(a = coacross(-id))
#> # A tibble: 5 x 6
#> id w x y z a
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 10 NA NA NA 10
#> 2 2 NA 21 NA NA 21
#> 3 3 NA 22 32 NA 22
#> 4 4 NA 23 33 43 23
#> 5 5 14 NA NA 44 14
Created on 2020-04-15 by the reprex package (v0.3.0)
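A small sketch of what "too early" might mean here, assuming the usual rlang behaviour: a quoting function resolves !!! while it captures the expression, whereas a function that collects dynamic dots (as coalesce() appears to) resolves it only when it is actually called. The name collect is hypothetical and only used for illustration.

```r
library(rlang)

# A quoting function such as expr() resolves `!!!` at capture time,
# before anything in the expression is evaluated.
args <- list(quote(w), quote(x))
captured <- expr(coalesce(!!!args))
print(captured)

# A function that collects dynamic dots resolves `!!!` when it is called.
# `collect` is a hypothetical helper, not part of rlang or dplyr.
collect <- function(...) list2(...)
spliced <- collect(!!!list(1, 2))
print(length(spliced))
```

If that reading is right, the coacross() wrapper works because the !!! inside it is only seen by coalesce() once mutate() is already evaluating the expression against the data.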
Feature request: coalesce working backwards, i.e. returning the last non-missing column.
coalesce() returns the first non-missing value among the columns/vectors passed to it. However, there are use cases where the opposite would be helpful, i.e. returning the last non-missing value from several columns/vectors.
In case anyone comes across this issue after googling, another workaround is to use do.call(coalesce, across(-id)), which is a little less typing than coalesce(!!!syms(vars_select(names(.), -id))) and needs no extra package.
If you want to do it in reverse you could just rev the input to coalesce, although that's probably inefficient.
library(dplyr, warn.conflicts = FALSE)
df <- tibble(
id = 1:5,
w = c(10, NA, NA, NA, 14),
x = c(NA, 21, 22, 23, NA),
y = c(NA, NA, 32, 33, NA),
z = c(NA, NA, NA, 43, 44)
)
df %>%
mutate(a = do.call(coalesce, across(-id)))
#> # A tibble: 5 × 6
#> id w x y z a
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 10 NA NA NA 10
#> 2 2 NA 21 NA NA 21
#> 3 3 NA 22 32 NA 22
#> 4 4 NA 23 33 43 23
#> 5 5 14 NA NA 44 14
df %>%
mutate(a = do.call(coalesce, rev(across(-id))))
#> # A tibble: 5 × 6
#> id w x y z a
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 10 NA NA NA 10
#> 2 2 NA 21 NA NA 21
#> 3 3 NA 22 32 NA 32
#> 4 4 NA 23 33 43 43
#> 5 5 14 NA NA 44 44
Created on 2021-08-04 by the reprex package (v2.0.0)
What about:
library(dplyr, warn.conflicts = FALSE)
df <- tibble(
id = 1:5,
w = c(10, NA, NA, NA, 14),
x = c(NA, 21, 22, 23, NA),
y = c(NA, NA, 32, 33, NA),
z = c(NA, NA, NA, 43, 44)
)
df %>%
mutate(a = coalesce(!!!select(., -id)))
#> # A tibble: 5 x 6
#> id w x y z a
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 10 NA NA NA 10
#> 2 2 NA 21 NA NA 21
#> 3 3 NA 22 32 NA 22
#> 4 4 NA 23 33 43 23
#> 5 5 14 NA NA 44 14
Since we're revisiting coalesce() and I see some feature requests gathered here, what about overriding values other than NAs?
The use case is data where missing or special values are encoded as 0, -1, Inf, NaN, "non available", etc.
We have na_if(), but we need to use it on all coalesced columns, and might need to turn the NAs back to their special values afterwards. It would be handy if coalesce() handled it.
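Until something like that exists, a rough sketch of the idea as a user-side helper might look like the following. coalesce_special() and its special argument are hypothetical names, not part of dplyr, and the data is made up for illustration. It treats every value listed in special as missing before coalescing, and does not try to restore the special values afterwards.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper, not part of dplyr: treat every value in `special`
# as missing, then coalesce the inputs as usual.
coalesce_special <- function(..., special = NULL) {
  cols <- lapply(list(...), function(x) {
    x[x %in% special] <- NA
    x
  })
  do.call(coalesce, cols)
}

# Made-up data where -1 and 0 encode "missing".
df <- tibble(
  id = 1:4,
  x = c(-1, 21, 0, -1),
  y = c(12, -1, 32, 0)
)

res <- df %>%
  mutate(a = coalesce_special(x, y, special = c(-1, 0)))
print(res)
```

Rows where every input holds a special value end up as NA in a, so a final replacement step would still be needed if the original encoding has to be kept.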
Wailing, gnashing my teeth, rending my clothing in the streets because coalesce(across(...)) still doesn't work.
With dplyr 1.0.0 introducing c_across and across, I was wondering if it was possible to revisit tidyverse/dplyr#3548 by allowing dplyr::coalesce to work more naturally with the new across or c_across functions.
After reading the row-wise article, I expected dplyr::coalesce to work like rowSums since it naturally works across rows, or at worst it would work like rowwise => sum.
However, coalesce doesn't seem to work with the across family at all, as you can see in the code below.
Would it be possible to make coalesce compatible with the new across workflow?
Created on 2020-04-14 by the reprex package (v0.3.0)
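For readers without the row-wise article at hand, the two patterns referred to above look roughly like this. This is a sketch using the same example data as elsewhere in the thread, not the code from the original post; the column name total and the use of na.rm = TRUE are assumptions made for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

df <- tibble(
  id = 1:5,
  w = c(10, NA, NA, NA, 14),
  x = c(NA, 21, 22, 23, NA),
  y = c(NA, NA, 32, 33, NA),
  z = c(NA, NA, NA, 43, 44)
)

# Pattern 1: a vectorised row-wise function applied to across().
by_rowsums <- df %>%
  mutate(total = rowSums(across(-id), na.rm = TRUE))

# Pattern 2: rowwise() plus c_across() with an ordinary summary function.
by_rowwise <- df %>%
  rowwise() %>%
  mutate(total = sum(c_across(w:z), na.rm = TRUE)) %>%
  ungroup()

print(by_rowsums)
print(by_rowwise)
```

The request is for coalesce() to slot into the first pattern the same way rowSums() does.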