tidyverse / multidplyr

A dplyr backend that partitions a data frame over multiple processes
https://multidplyr.tidyverse.org

Request: Partition by group_by #52

Closed by ChiWPak 7 years ago

ChiWPak commented 7 years ago

I can often group my data into fewer groups than there are cores on my node. I would like to partition the data across cores by these groups, operate on each group, and then collect the results. Is there a way to make partition(cluster = cluster) split by the groups from group_by()?

ChiWPak commented 7 years ago

In case anyone else is interested, partitioning by group is easy: data %>% group_by(X) %>% partition(X)

Ax3man commented 7 years ago

Or just data %>% partition(X).
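
For later readers, here is a minimal end-to-end sketch of the partition, operate, collect workflow asked about above. It assumes the released multidplyr API, where (as far as I can tell) the grouping comes from group_by() and partition() takes the cluster, rather than the partition(X) form used in the comments above. The toy data, the column name value, and the summary are made up for illustration.

```r
library(dplyr)
library(multidplyr)

# Hypothetical toy data: three groups in X.
data <- tibble(
  X     = rep(c("a", "b", "c"), each = 4),
  value = 1:12
)

# One worker process per group (assumption: the machine can run this many).
cluster <- new_cluster(n_distinct(data$X))
cluster_library(cluster, "dplyr")

# group_by() before partition() keeps all rows of a group on the same worker.
parted <- data %>%
  group_by(X) %>%
  partition(cluster)

# The summary runs on the workers; collect() brings the results back.
result <- parted %>%
  summarise(total = sum(value)) %>%
  collect() %>%
  arrange(X)  # row order after collect() is not guaranteed, so sort explicitly

print(result)
```

If you are on the older development version discussed in this thread, the partition(X) form from the comments above is the one to use instead.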