uwescience / raco

Compilation and rule-based optimization framework for relational algebra. Raco is the language, optimization, and query translation layer for the Myria project.

Implement local shuffle optimization for broadcast relations #549

Open senderista opened 7 years ago

senderista commented 7 years ago

@stechu had a simple and beautiful idea: when an input to a hash-partitioned shuffle is a broadcast relation, every worker already holds a full copy, so no data needs to move. Each worker can just filter its copy in place, retaining only the tuples that satisfy the hash partition condition for that worker. The same would apply to round-robin shuffles, except that each worker would keep a slice of the tuples chosen by count, split on uniform boundaries, rather than by hash. This might require support on the MyriaX side.
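
To make the idea concrete, here is a minimal Python sketch of the two local filters. It is not Raco or MyriaX code: the function names, the CRC32-based partition function, and the tuple layout are all hypothetical. The round-robin variant also assumes every worker sees the broadcast relation in the same order, which the real system may not guarantee.

```python
import zlib


def partition_of(tup, key_cols, num_workers):
    """Hypothetical partition function: a deterministic hash of the key columns."""
    key = repr(tuple(tup[c] for c in key_cols)).encode("utf-8")
    return zlib.crc32(key) % num_workers


def local_hash_shuffle(tuples, key_cols, worker_id, num_workers):
    """Keep only the tuples a hash-partitioned shuffle would route to this worker."""
    return [t for t in tuples
            if partition_of(t, key_cols, num_workers) == worker_id]


def local_round_robin_shuffle(tuples, worker_id, num_workers):
    """Keep a contiguous slice of the tuples, split by count on uniform boundaries.

    Assumes all workers hold the broadcast relation in the same order.
    """
    n = len(tuples)
    lo = worker_id * n // num_workers
    hi = (worker_id + 1) * n // num_workers
    return tuples[lo:hi]


# Every worker starts with the same full copy of the broadcast relation.
broadcast = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]
num_workers = 3

hash_parts = [local_hash_shuffle(broadcast, [0], w, num_workers)
              for w in range(num_workers)]
rr_parts = [local_round_robin_shuffle(broadcast, w, num_workers)
            for w in range(num_workers)]
```

In both cases the per-worker results are disjoint and together cover the whole relation, which is the property the network shuffle would otherwise have provided.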