tidyverse / multidplyr

A dplyr backend that partitions a data frame over multiple processes
https://multidplyr.tidyverse.org

Standard evaluation version for partition #12

Closed (fugufisch closed 7 years ago)

fugufisch commented 8 years ago

I'd like to use partition programmatically, the way I use group_by_ in dplyr. I tried exporting the existing partition_ function from multidplyr, but that didn't do the trick.

aavilaherrera commented 8 years ago

It seems one can wrap the groups in lazyeval::all_dots() and pass them to the unexported partition_(). Here's an example adapted from the vignette:

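library(dplyr)
library(multidplyr)
library(nycflights13)  # provides the flights data used in the vignette

# Build the grouping variables programmatically: one literal formula and
# one interpolated from a string via as.name().
dots <- list(~carrier, lazyeval::interp(~VarName, VarName = as.name('year')))
dots
#> [[1]]
#> ~carrier
#> 
#> [[2]]
#> ~year

# partition_ is not exported, hence the triple colon.
flights1 <- multidplyr:::partition_(flights, group = lazyeval::all_dots(dots))
flights2 <- summarise(flights1, dep_delay = mean(dep_delay, na.rm = TRUE))
flights3 <- collect(flights2)  # gather the per-worker results into one data frame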

I'm not sure this is appropriate for all cases, and it may not be good advice.

hadley commented 7 years ago

Fixed in #41
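
For readers on newer releases, here is a minimal sketch of the same programmatic grouping. It is not taken from #41, whose contents are not shown in this thread. It assumes a multidplyr version in which partition() takes a cluster created by new_cluster() and inherits its grouping from a preceding group_by(), and a dplyr version that provides across() and all_of(). The by_cols vector is a hypothetical stand-in for column names supplied at run time.

```r
library(dplyr)
library(multidplyr)
library(nycflights13)

# Hypothetical input: grouping columns supplied as strings at run time.
by_cols <- c("carrier", "year")

# Assumption: new_cluster() starts local worker processes (two here).
cluster <- new_cluster(2)

result <- flights %>%
  group_by(across(all_of(by_cols))) %>%  # group by column names held in a vector
  partition(cluster) %>%                 # rows of one group stay on one worker
  summarise(dep_delay = mean(dep_delay, na.rm = TRUE)) %>%
  collect()                              # bring the summaries back to this session
```

Under these assumptions the grouping is done with ordinary dplyr tools before partitioning, so no separate standard-evaluation variant of partition() is needed.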