mlr-org / parallelMap

R package to interface some popular parallelization backends with a unified interface
https://parallelmap.mlr-org.com

Can we have `parallelLapply` that does not drop list element names? #58

Closed GegznaV closed 4 years ago

GegznaV commented 6 years ago

Regular lapply() returns a named list, whereas parallelLapply() returns an unnamed one. Consider the following simplified example.

# Function
my_fun <- function(gr, x) cbind(as.data.frame(x), gr)

# Data
input_df <- data.frame(a = 6, b = 9, c = 12)
mat <- matrix(1:2, ncol = 2)
# Parallel lapply
(result_parallel <- parallelMap::parallelLapply(xs = input_df, fun = my_fun, x = mat))
#> [[1]]
#>   V1 V2 gr
#> 1  1  2  6
#> 
#> [[2]]
#>   V1 V2 gr
#> 1  1  2  9
#> 
#> [[3]]
#>   V1 V2 gr
#> 1  1  2 12
# Regular lapply
(result_regular <- lapply(X = input_df, FUN = my_fun, x = mat))
#> $a
#>   V1 V2 gr
#> 1  1  2  6
#> 
#> $b
#>   V1 V2 gr
#> 1  1  2  9
#> 
#> $c
#>   V1 V2 gr
#> 1  1  2 12
# Compare
all.equal(result_parallel, result_regular)
#> [1] "names for current but not for target"

In my application, it's crucial to keep the names.

Question: Is the order of the elements in the output list of parallelLapply() always the same as the order in the input list? I.e., is it always safe to use this code to identify list elements by name:

names(result_parallel) <- names(input_df)

a. If yes, why does parallelMap not offer an option for this kind of output? Are there any known issues with named lists?

b. If no, under which circumstances does the order of the output elements differ from the order of the input elements, and under which is it preserved?
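
If the answer turns out to be "yes", the renaming step could be wrapped in a small helper. The following is only a sketch: `keep_names()` and `unnamed_lapply()` are hypothetical names, not part of parallelMap, and the whole approach rests on the assumption (still unconfirmed at this point in the thread) that the mapper returns results in input order. The stand-in mapper exists only so that the example runs without parallelMap installed.

```r
# Hypothetical helper (not part of parallelMap): reattach the input names
# to the result of a parallelLapply()-style mapper.
# ASSUMPTION: `mapper` returns its results in the same order as `xs`.
keep_names <- function(xs, fun, ..., mapper) {
  res <- mapper(xs = xs, fun = fun, ...)
  # Refuse to rename if the mapper changed the number of elements
  stopifnot(length(res) == length(xs))
  names(res) <- names(xs)
  res
}

# Stand-in for a mapper that drops names (same argument names as
# parallelLapply(): xs, fun, ...), used here instead of a real backend
unnamed_lapply <- function(xs, fun, ...) unname(lapply(xs, fun, ...))

# Same function and data as in the example above
my_fun <- function(gr, x) cbind(as.data.frame(x), gr)
input_df <- data.frame(a = 6, b = 9, c = 12)
mat <- matrix(1:2, ncol = 2)

# `xs` and `fun` are passed by name on purpose: otherwise the extra
# argument `x = mat` would partially match the formal `xs`
result_named <- keep_names(xs = input_df, fun = my_fun, x = mat,
                           mapper = unnamed_lapply)
result_regular <- lapply(X = input_df, FUN = my_fun, x = mat)

all.equal(result_named, result_regular)
```

With parallelMap loaded, `mapper = parallelMap::parallelLapply` could presumably be passed instead of the stand-in, but that is only safe if the order guarantee asked about above actually holds.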

jakob-r commented 6 years ago

treated in #59

pat-s commented 4 years ago

fixed in #59