apavlo / h-store

H-Store Distributed Main Memory OLTP Database System
https://hstore.cs.brown.edu
GNU General Public License v3.0

Distributed Query Optimizations #17

Open apavlo opened 12 years ago

apavlo commented 12 years ago

One of the most important parts of a database is its query planner and optimizer. This is especially true in a distributed database. There are several well-known optimizations that commercial database vendors use to improve the performance of distributed queries. In this project, students will explore various techniques for improving system performance using query plan optimizations.

  1. Improve predicate/aggregate push-down optimizations. The PlanOptimizer should determine whether a projection in a query plan can be pushed down to all nodes. For example, consider the following distributed query that retrieves the maximum salary per department from the employees table, which is split across multiple partitions: SELECT MAX(salary), department FROM employees GROUP BY department; Assuming that employees has more columns than just the two used in the query, the database will want to execute the projection in parallel at each partition so that only the two needed columns are sent back to the initiating node to coalesce the results. Likewise, the MAX() operation can also be executed in parallel so that each partition sends only the minimum amount of data (a simplified sketch of this decomposition appears after this list).
  2. Identify parts of a query plan that are better executed in Java. That is, instead of executing a PlanFragment in the ExecutionEngine, the PartitionExecutor will process it directly from within Java. The goal is to reduce the cost of executing a distributed query by shortcutting certain operations. For example, if a transaction executes a distributed UPDATE query, then the output of the top-most PlanFragment (which is executed at the transaction's base partition) is just a single value: the sum of the single-value input tables generated by the PlanFragment for the distributed UPDATE operation. This value represents the number of tuples that were modified at each partition. Thus, the PartitionExecutor can just perform this summation itself instead of going to the ExecutionEngine. Similarly, the top-most PlanFragment for broadcast SELECT statements will often just combine the output of the PlanFragments executed at each partition, which can also be done directly in Java (see the second sketch after this list).
    • Update the PlanOptimizer to identify which PlanFragments can be executed in Java (this code may need to live somewhere else because the PlanOptimizer won't have the PlanFragments yet) and update the appropriate flags (e.g., "fastaggregate" and "fastcombine").
    • Create new classes called CombineExecutor and AggregateExecutor in edu.brown.hstore.executors that implement the desired functionality.
    • Add a new boolean configuration option in HStoreConf called "exec_fast_executors" that the PartitionExecutor will check at run time to determine whether to use this new feature.
    • Update PartitionExecutor's constructor to instantiate the new executor classes.
    • In PartitionExecutor.dispatchWorkFragments(), add code to check whether "exec_fast_executors" is true and whether a PlanFragment (stored in the WorkFragment) that needs to be executed on the local partition has one of the PlanFragment flags set to true.
  3. Improve support for n-way joins. When a query needs to join multiple tables together, it is important that the order in which those tables are joined minimizes both the number of tuples that must be examined and the amount of data that is sent between nodes. This is a difficult problem and database optimizers are often wrong. For example, H-Store generates a query plan for TPC-E's BrokerVolume that does not select the proper join order, which causes the DBMS to send entire tables to a single node. This is because the query planner does not have access to statistical information about the tables in order to make the best selection. Instead of improving the statistical analysis of H-Store's query optimizer, students will implement an alternative approach used by MongoDB where the system generates all possible plans for the query and tries each of them out at run time to determine which one is the best. To do this, the catalog will need to be extended to support multiple query plans per Statement. At runtime, the system will select a random query plan for the query and keep track of how long it takes to execute. Once enough samples are collected, the system will then choose the query plan with the lowest run time (a sketch of this trial-based selection appears after this list). The information gained about query plans can be written into the catalog project jar so that future invocations of the DBMS can use the best query plan without having to run trials first.
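
To make the intent of the push-down in item 1 concrete, here is a minimal, self-contained Java sketch. This is not H-Store code; the Row type and method names are made up for illustration. It shows how the MAX(salary)/GROUP BY department query decomposes: each partition computes a local maximum per department over only the two projected columns, and the initiating node merges those small partial results by taking the maximum of the maxima.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration (not H-Store code): why MAX + GROUP BY can be
// pushed down. Each partition computes a local maximum per department over
// only the two projected columns, and the coordinator merges the partials.
public class MaxPushDownSketch {
    // One projected row: (department, salary). Column names come from the
    // example query in this issue.
    record Row(String department, long salary) {}

    // Executed independently at every partition on its local rows.
    static Map<String, Long> localMaxPerDepartment(List<Row> localRows) {
        Map<String, Long> partial = new HashMap<>();
        for (Row r : localRows) {
            partial.merge(r.department(), r.salary(), Math::max);
        }
        return partial;
    }

    // Executed at the initiating node: only the small per-partition partials
    // are shipped over the network, never the full employees table.
    static Map<String, Long> mergePartials(List<Map<String, Long>> partials) {
        Map<String, Long> result = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((dept, max) -> result.merge(dept, max, Math::max));
        }
        return result;
    }
}
```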
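
For item 2, here is a rough sketch of the fast-path idea, assuming the simplification that each partition's single-value result table can be represented as a plain long. The class and method names below are assumptions for illustration and not the actual H-Store API.

```java
import java.util.List;

// Hypothetical sketch: for a distributed UPDATE, every remote partition
// returns a one-row, one-column table holding its modified-tuple count, and
// the top-most PlanFragment only needs to sum them. Doing that summation
// directly in Java avoids a trip into the native ExecutionEngine.
public class FastAggregateSketch {

    // Each element stands for the single value produced by one partition's
    // PlanFragment (the number of tuples it modified there).
    static long combineUpdateCounts(List<Long> perPartitionCounts) {
        long modifiedTuples = 0;
        for (long count : perPartitionCounts) {
            modifiedTuples += count;
        }
        return modifiedTuples;
    }

    // Gate the fast path behind the proposed "exec_fast_executors" config
    // option and the proposed per-fragment "fastaggregate" flag.
    static boolean canUseFastPath(boolean execFastExecutors, boolean fragmentFastAggregate) {
        return execFastExecutors && fragmentFastAggregate;
    }
}
```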
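
For item 3, a minimal sketch of the trial-based plan selection, with all names invented for illustration: candidate plans are sampled at random until enough timings have been collected, then the plan with the lowest average run time is used from that point on.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch (names are assumptions): explore the candidate query
// plans by running them and recording timings, then exploit the fastest one.
public class PlanTrialSelector {
    private final List<String> candidatePlanIds;   // one entry per plan in the catalog
    private final Map<String, Long> totalNanos = new HashMap<>();
    private final Map<String, Integer> samples = new HashMap<>();
    private final int samplesPerPlan;
    private final Random rng = new Random();

    PlanTrialSelector(List<String> candidatePlanIds, int samplesPerPlan) {
        this.candidatePlanIds = candidatePlanIds;
        this.samplesPerPlan = samplesPerPlan;
    }

    // Pick the plan to run next: choose randomly until every candidate has
    // enough samples, then always return the fastest one.
    String choosePlan() {
        for (String planId : candidatePlanIds) {
            if (samples.getOrDefault(planId, 0) < samplesPerPlan) {
                return candidatePlanIds.get(rng.nextInt(candidatePlanIds.size()));
            }
        }
        return bestPlan();
    }

    // Record how long one execution of the given plan took.
    void recordSample(String planId, long elapsedNanos) {
        totalNanos.merge(planId, elapsedNanos, Long::sum);
        samples.merge(planId, 1, Integer::sum);
    }

    // The plan with the lowest average run time; this choice could be written
    // back into the catalog project jar so later runs skip the trials.
    String bestPlan() {
        String best = null;
        double bestAvg = Double.MAX_VALUE;
        for (String planId : candidatePlanIds) {
            int n = samples.getOrDefault(planId, 0);
            if (n == 0) continue;
            double avg = (double) totalNanos.get(planId) / n;
            if (avg < bestAvg) {
                bestAvg = avg;
                best = planId;
            }
        }
        return best;
    }
}
```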

Students will use H-Store's built-in benchmarks and profiling tools to measure the system before and after the query planner optimizations are implemented. Some test cases will be provided to validate that the query plans are correct, but students are strongly encouraged to write their own.

apavlo commented 11 years ago

Task 1 was completed in the spring. Task 2 is mostly finished by @mimosally in pull request #59, but it has not been tested or verified to be working correctly.