tidyverse / multidplyr

A dplyr backend that partitions a data frame over multiple processes
https://multidplyr.tidyverse.org

evenly divide rows for ungrouped data frame #156

Open steveharoz opened 6 months ago

steveharoz commented 6 months ago

The documentation for partition() discusses what it does for grouped data:

> Partitioning ensures that all observations in a group end up on the same worker

But what about ungrouped data? In those cases, it'd make sense to evenly divide the rows between cores.
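
To make "evenly divide" concrete, here is a rough base-R sketch of the behaviour I have in mind. It is not multidplyr's implementation, and `assign_workers()` is a hypothetical helper name used only for illustration; it hands out rows round-robin so that per-worker row counts differ by at most one.

```r
# Hypothetical helper (not part of multidplyr): assign each row to a worker
# in round-robin order so per-worker row counts differ by at most one.
assign_workers <- function(n_rows, n_workers) {
  rep_len(seq_len(n_workers), n_rows)
}

df <- data.frame(x = 1:10)

# One worker id per row: 1, 2, 3, 1, 2, 3, ...
worker <- assign_workers(nrow(df), n_workers = 3)

# Split the data frame into one chunk per worker
chunks <- split(df, worker)

# Row counts per worker: 4, 3, 3
vapply(chunks, nrow, integer(1))
```

As a possible workaround in the meantime, one could add such a column to the data, `group_by()` it, and then call `partition()`. Going by the documentation quoted above, each chunk should then stay together on one worker, though whether the chunks land on distinct workers depends on how `partition()` assigns groups.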

If it already evenly divides the rows between cores, please document it. If not, please implement it.