tidyverse / multidplyr

A dplyr backend that partitions a data frame over multiple processes
https://multidplyr.tidyverse.org

Fix for issue with small partitioning variables #8

Closed fugufisch closed 8 years ago

fugufisch commented 8 years ago

This should enable partitioning on variables that have fewer distinct values than the cluster has nodes.

codecov-io commented 8 years ago

Current coverage is 19.33%

Merging #8 into master will decrease coverage by 0.33% as of b5d0c41

@@            master      #8   diff @@
======================================
  Files            9       9       
  Stmts          178     181     +3
  Branches         0       0       
  Methods          0       0       
======================================
  Hit             35      35       
  Partial          0       0       
- Missed         143     146     +3

fugufisch commented 8 years ago

I just realized this doesn't work for cases with fewer shards than grouping values. I'll open a new pull request with another approach.
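
For illustration only, here is a minimal Python sketch of one way to assign grouping values to shards so that both cases mentioned in this thread are handled: fewer distinct values than nodes, and fewer shards than grouping values. This is not multidplyr's code and not the approach from this pull request; the function names `assign_groups_to_shards` and `partition` are hypothetical, and round-robin assignment is just one plausible strategy.

```python
from collections import defaultdict


def assign_groups_to_shards(values, n_shards):
    """Map each distinct grouping value to a shard index (hypothetical helper)."""
    if n_shards < 1:
        raise ValueError("n_shards must be at least 1")
    distinct = sorted(set(values))
    # Never use more shards than there are distinct values,
    # so no shard is left without a group.
    used = min(n_shards, len(distinct))
    # Round-robin: when there are more values than shards,
    # several values share a shard.
    return {value: i % used for i, value in enumerate(distinct)}


def partition(rows, key, n_shards):
    """Split a list of dict rows into shards, keeping each group on one shard."""
    mapping = assign_groups_to_shards([row[key] for row in rows], n_shards)
    shards = defaultdict(list)
    for row in rows:
        shards[mapping[row[key]]].append(row)
    return dict(shards)
```

With two distinct values and four nodes, only two shards are used; with three distinct values and two shards, two of the values share a shard, and all rows of a given group still stay together.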