zhouqingqing / qpmodel

A Relational Optimizer and Executor
MIT License

Enable replicated enforcement and implement broadcast execution #199

Closed arzuschen closed 3 years ago

arzuschen commented 3 years ago

This patch consists of two main parts:

1) Enable replicated enforcement. Replicated distribution can now be enforced on top of a distribution on any list expression by adding a broadcast node. This makes it possible to hash join two distributed tables by broadcasting the smaller table for the build side. The total cost may therefore decrease, since a join that previously required two redistributions can now be done with a single broadcast.

2) Implement broadcast execution. Execution of the broadcast physic node is implemented using code shared with redistribution. The Physic RemoteExchange class is modified to hold the portion shared by broadcast and redistribution. The estimated cost of a broadcast is set to the same value as a redistribution, which is 2.0x the cardinality of the child node.
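To make the cost argument concrete, here is a minimal sketch of the arithmetic, not the optimizer's actual code. It assumes only what is stated above: an exchange (broadcast or redistribution) costs 2.0x the cardinality of its child. The function names and the row counts are hypothetical, and only exchange costs are compared; the cost of the join itself is ignored.

```python
# Cost factor taken from the description above: both broadcast and
# redistribution are estimated at 2.0x the child's cardinality.
EXCHANGE_COST_FACTOR = 2.0


def exchange_cost(child_rows: float) -> float:
    """Estimated cost of one broadcast or redistribution node (hypothetical helper)."""
    return EXCHANGE_COST_FACTOR * child_rows


def two_redistributions_cost(left_rows: float, right_rows: float) -> float:
    """Both join inputs are redistributed on the join key."""
    return exchange_cost(left_rows) + exchange_cost(right_rows)


def one_broadcast_cost(left_rows: float, right_rows: float) -> float:
    """Only the smaller input (the build side) is broadcast; the other stays in place."""
    return exchange_cost(min(left_rows, right_rows))


# Hypothetical cardinalities: a large probe table and a small build table.
large, small = 1_000_000, 1_000
print(two_redistributions_cost(large, small))  # 2002000.0
print(one_broadcast_cost(large, small))        # 2000.0
```

Under this cost model the broadcast plan is never more expensive than the two-redistribution plan, which is why enabling it changes the plans the optimizer picks.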

With the above implemented, the plans in the redistribution unit tests change, since broadcasting is likely to produce lower-cost plans. A parameter in option.optimize is therefore introduced to turn replicated distribution enforcement on or off, so the expected plans for the redistribution tests do not need to change. The distributed TPCH result plans are updated accordingly.
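The effect of such a switch on plan selection can be sketched as follows. This is an illustration under assumptions, not the qpmodel implementation: the `Candidate` type, the `pick_plan` function, the flag name `enable_broadcast`, and the cost figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate plan with its estimated cost (hypothetical type)."""
    name: str
    cost: float
    uses_broadcast: bool


def pick_plan(candidates: list[Candidate], enable_broadcast: bool) -> Candidate:
    """Return the cheapest candidate, skipping broadcast plans when the flag is off."""
    allowed = [c for c in candidates if enable_broadcast or not c.uses_broadcast]
    return min(allowed, key=lambda c: c.cost)


# Hypothetical candidates for one distributed hash join.
candidates = [
    Candidate("redistribute both sides", cost=2_002_000.0, uses_broadcast=False),
    Candidate("broadcast build side", cost=2_000.0, uses_broadcast=True),
]

print(pick_plan(candidates, enable_broadcast=False).name)  # redistribute both sides
print(pick_plan(candidates, enable_broadcast=True).name)   # broadcast build side
```

With the flag off, the search space is the same as before the patch, so existing expected plans stay valid; with it on, the cheaper broadcast plan wins.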