Closed danzafar closed 4 years ago
OK, now I’m seeing where some of the coalesce confusion is coming from. It turns out that dplyr::coalesce is going to be a namespace conflict. dplyr::coalesce looks like:
    > dplyr::coalesce
    function (...)
    {
        if (missing(..1)) {
            abort("At least one argument must be supplied")
        }
        values <- list2(...)
        x <- values[[1]]
        values <- values[-1]
        for (i in seq_along(values)) {
            x <- replace_with(x, is.na(x), values[[i]], glue("Argument {i + 1}"),
                glue("length of {fmt_args(~x)}"))
        }
        x
    }
which is definitely not going to be flexible enough to accept a spark_tbl. We can either try to hack the API somehow, or just accept the namespace conflict for coalesce and write this method:
    coalesce.data.frame <- function(...) {
        dplyr::coalesce(...)
    }
which will dispatch a data.frame (or tbl) to the right place.
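To make the dispatch idea concrete, here is a minimal sketch in base R. It assumes the package defines its own S3 generic named coalesce; that is my assumption, not something shown in this thread. vector_coalesce is a hypothetical stand-in for dplyr::coalesce so the example runs without dplyr, and the spark_tbl method only records a partition count rather than calling Spark. The default method plays the same role as the coalesce.data.frame method above: anything that is not a spark_tbl is handed to the vector version.

```r
# Hypothetical stand-in for dplyr::coalesce (assumption, so the sketch runs
# without dplyr): element-wise, keep the first non-NA value.
vector_coalesce <- function(...) {
  values <- list(...)
  x <- values[[1]]
  for (v in values[-1]) {
    missing_idx <- is.na(x)
    # Recycle v to the length of x, then fill only the NA positions.
    x[missing_idx] <- rep_len(v, length(x))[missing_idx]
  }
  x
}

# S3 generic that takes over the name `coalesce` and dispatches on the
# class of the first argument.
coalesce <- function(x, ...) UseMethod("coalesce")

# Anything without a more specific method goes to the vector version.
coalesce.default <- function(x, ...) vector_coalesce(x, ...)

# Hypothetical spark_tbl method: a real one would ask Spark to reduce the
# number of partitions; this one just stores the requested count.
coalesce.spark_tbl <- function(x, n_partitions, ...) {
  attr(x, "n_partitions") <- n_partitions
  x
}

# Usage: a plain vector goes to the default method ...
coalesce(c(1, NA, 3), 0)
# [1] 1 0 3

# ... while an object of class spark_tbl goes to the Spark-style method.
fake_tbl <- structure(list(), class = "spark_tbl")
attr(coalesce(fake_tbl, 2), "n_partitions")
# [1] 2
```

The trade-off is the one described above: the generic masks dplyr::coalesce when both packages are attached, but existing vector calls keep working because they fall through to the default method.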
@jcamstan3370 as we discussed, for the dplyr coalesce the first value will have to be a Column value for it to work on spark_tbls. In case the first value needs to be a constant (not sure how that's possible), I added an as.Column function (also lit and as_Column) to help out with this. See #47.
A good beginner task, at least I think....