futureverse / future.apply

:rocket: R package: future.apply - Apply Function to Elements in Parallel using Futures
https://future.apply.futureverse.org

Implement future version of .mapply() ? #54

Closed mllg closed 4 years ago

mllg commented 4 years ago

.mapply() is an "internal" function in base R, but it is nevertheless a candidate for this package.

I use it quite often to avoid the do.call(mapply, <list_of_arguments>) pattern. This is especially useful if you need to iterate over a data.frame-like tabular structure in a row-wise fashion.

HenrikBengtsson commented 4 years ago

Ha, I think this is the first time I've heard of base::.mapply(). I guess it's important to make it crystal clear that the help page says: "Internal objects in the base package most of which are only user-visible because of the special nature of the base namespace." That sounds like a disclaimer for "We wanted to make these internal but couldn't right now, though we might be able to do it later".

Can you clarify further with a small example that illustrates how using mapply() and .mapply() would differ? I guess it has to do with the difference between the ... and dots arguments:

> args(base::mapply)
function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE) 
NULL

> args(base::.mapply)
function (FUN, dots, MoreArgs) 
NULL

mllg commented 4 years ago

> Can you clarify further with a small example that illustrates how using mapply() and .mapply() would differ? I guess it has to do with the difference between the ... and dots arguments:

Sure, here is an example where I apply a function rowwise over a data.frame to create a CSV-like representation:

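# Paste all values of one row into a single semicolon-separated string
csvify = function(...) paste(..., sep = ";")

# .mapply() takes the columns as one list (a data.frame is a list of columns)
.mapply(csvify, iris, list())

# The mapply() counterpart needs do.call() to splice the columns into '...'
dots = c(list(FUN = csvify), iris)
do.call(mapply, dots)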

Note that apply() is often unsuitable because it converts its input to a matrix first, which for a data.frame with mixed column types turns everything into character.
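A minimal sketch of that difference, using only base R and the built-in iris data. The helper name first_col_class is hypothetical and only serves the illustration:

```r
# as.matrix() on a data.frame with a factor column yields a character matrix,
# and that matrix is what apply() iterates over.
m <- as.matrix(iris)
is_char_matrix <- is.character(m)

# Hypothetical helper: report the class of the Sepal.Length value of a row.
first_col_class <- function(Sepal.Length, ...) class(Sepal.Length)

# apply() hands each row over as a single character vector.
via_apply <- unique(apply(iris, 1, function(row) class(row)))

# .mapply() hands each column value over with its original type.
via_mapply <- unique(unlist(.mapply(first_col_class, iris, MoreArgs = NULL)))

print(c(apply = via_apply, .mapply = via_mapply))
```

With apply() the function only ever sees strings, whereas with .mapply() the numeric columns stay numeric.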

HenrikBengtsson commented 4 years ago

So, it looks like .mapply() behaves like mapply() with the non-default SIMPLIFY = FALSE, so you need to do:

csvify <- function(...) paste(..., sep = ";")
y0 <- .mapply(csvify, iris, MoreArgs = list())
args <- c(list(FUN = csvify), iris, list(SIMPLIFY = FALSE))
y1 <- do.call(mapply, args = args)
stopifnot(identical(y1, y0))

to replicate it with mapply(). This gives a first prototype:

future_.mapply <- function(FUN, dots, MoreArgs) {
  args <- c(list(FUN = FUN), dots,
            list(MoreArgs = MoreArgs, SIMPLIFY = FALSE, USE.NAMES = FALSE))
  do.call(future_mapply, args = args, envir = parent.frame())
}

EDIT 2020-04-05: Updated the above prototype to use USE.NAMES = FALSE (was USE.NAMES = TRUE). Adding package tests revealed the difference.
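To illustrate why USE.NAMES = FALSE matters, here is a small base-R sketch. The input list dots is made up for the illustration. When the first iterated argument is an unnamed character vector, mapply() with the default USE.NAMES = TRUE names the result after it, whereas .mapply() never adds names:

```r
# Made-up input: the first element is an unnamed character vector.
dots <- list(x = c("a", "b"), y = 1:2)

# Reference result from the internal function: an unnamed list.
y0 <- .mapply(paste, dots, MoreArgs = NULL)

# mapply() with the default USE.NAMES = TRUE names the result "a", "b".
y_named <- do.call(mapply, c(list(FUN = paste), dots,
                             list(SIMPLIFY = FALSE)))

# With USE.NAMES = FALSE the result matches .mapply() exactly.
y_plain <- do.call(mapply, c(list(FUN = paste), dots,
                             list(SIMPLIFY = FALSE, USE.NAMES = FALSE)))

print(names(y_named))
print(identical(y_plain, y0))
```

The iris example did not expose this because its first column is numeric and unnamed, so mapply() had no names to attach.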

HenrikBengtsson commented 4 years ago

Added for the next release.

BTW, I've fixed the above prototype to use USE.NAMES = FALSE (was TRUE).

HenrikBengtsson commented 4 years ago

future.apply 1.5.0 with future_.mapply() is now on CRAN.