Closed stanstrup closed 5 years ago
Wrapping in do() makes the above example work:
cars_serial <-
  mtcars %>%
  invoke_rows(.f = sum) %>%
  unnest()

cars_parallel <-
  mtcars %>%
  partition(carb, cluster = cluster) %>%
  do(invoke_rows(.f = sum, .d = .)) %>%
  collect() %>%
  unnest()

setdiff(cars_serial, cars_parallel) %>% nrow()
Thanks!
The workaround now gives me:
Warning message:
group_indices_.grouped_df ignores extra arguments
I don't understand what is going wrong here...
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server 2008 R2 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=Danish_Denmark.1252 LC_CTYPE=Danish_Denmark.1252 LC_MONETARY=Danish_Denmark.1252 LC_NUMERIC=C
[5] LC_TIME=Danish_Denmark.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] tidyr_0.6.2.9000 purrrlyr_0.0.1.9000 multidplyr_0.0.0.9000 dplyr_0.5.0.9005
loaded via a namespace (and not attached):
[1] Rcpp_0.12.10 digest_0.6.12 withr_1.0.2 assertthat_0.2.0 R6_2.2.1 git2r_0.18.0 magrittr_1.5
[8] httr_1.2.1 rlang_0.1.9000 lazyeval_0.2.0 curl_2.6 devtools_1.13.0 tools_3.3.3 glue_1.0.0
[15] memoise_1.1.0 knitr_1.15.1 tibble_1.3.0.9006
Most likely because you have updated dplyr to the latest dev version, but multidplyr isn't up to date.
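A quick way to check for that kind of mismatch is to print the installed versions side by side. This is only a sketch in base R; the package list is simply the set of packages named in this thread, and packages that are not installed are skipped.

```r
# Packages mentioned in this thread (adjust as needed)
pkgs <- c("dplyr", "multidplyr", "purrrlyr", "tidyr")

# Keep only the ones that are actually installed
installed <- pkgs[pkgs %in% rownames(installed.packages())]

# Named character vector: package name -> installed version
versions <- vapply(
  installed,
  function(p) as.character(packageVersion(p)),
  character(1)
)

print(versions)
```

A dev version of dplyr (for example one ending in .9000 or higher) next to an older multidplyr build would be consistent with the explanation above.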
Sorry to resurrect this issue, but I'm getting the same "group_indices_.grouped_df ignores extra arguments" warning. As far as I can tell it's not causing any real problems, but I'm concerned I'm missing something. Should I be worried?
Here's a minimal example:
library(tidyverse)
library(multidplyr)
df <- data.frame(A = c(1, 2, 3, 4, 5, 6),
                 B = c(4, 5, 5, 6, 8, 4),
                 group = c(1, 1, 1, 2, 2, 2))
cluster <- create_cluster(2)
byGroup <- partition(df, group, cluster=cluster)
The resulting byGroup is a party_df that looks correct to me:
> byGroup
Source: party_df [6 x 3]
Groups: group
Shards: 2 [3--3 rows]
# S3: party_df
      A     B group
  <dbl> <dbl> <dbl>
1     1     4     1
2     2     5     1
3     3     5     1
4     4     6     2
5     5     8     2
6     6     4     2
Here are the relevant parts of my sessionInfo():
R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.6
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] multidplyr_0.0.0.9000 modelr_0.1.1 dplyr_0.7.4 purrr_0.2.4
[5] readr_1.1.1 tidyr_0.7.2 tibble_1.3.4 ggplot2_2.2.1
[9] tidyverse_1.1.1 bnlearn_4.2
This will eventually be fixed by an implementation of group_map()/group_modify(); I don't currently have plans to add support for purrr/purrrlyr.
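For reference, this is roughly what per-group, row-wise work looks like with group_modify() in plain serial dplyr. It is a sketch only: it assumes dplyr 0.8.1 or later is installed, it does not involve multidplyr or a cluster at all, and the column name row_total is made up for illustration. Note that group_modify() passes each group without its grouping column, so the sum here excludes carb and is not identical to invoke_rows(.f = sum).

```r
library(dplyr)

# Serial illustration only: no cluster, no party_df.
row_totals <- mtcars %>%
  group_by(carb) %>%
  # .x is the data for one group, without the grouping column carb
  group_modify(~ mutate(.x, row_total = rowSums(.x))) %>%
  ungroup()

# One row per car, with the new row_total column appended
dim(row_totals)
```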
It seems invoke_rows doesn't accept a party_df object. That would be useful...
Error: .d must be a data frame