Benchmark to Measure Impact of Planner Changes

Background

We recently added a new command vexplain trace to Vitess. This command runs a query under tracing mode and provides detailed information about operator calls and data flow. We want to leverage this to create a benchmark tool that can measure the impact of planner changes.

Objective

Create a benchmark tool to measure the impact of planner changes by comparing the vexplain trace output before and after the change.

Advantages Over Traditional Time-Based Benchmarking

This approach offers several significant advantages over traditional time-based benchmarking:

Consistency: Unlike time-based benchmarks that can vary due to system load, network conditions, or other external factors, our metrics are deterministic and directly reflect the query plan's efficiency.
No Special Hardware Required: Traditional benchmarks often require dedicated, bare-metal hardware to ensure consistent results. Our tool can run on any system, making it more accessible for developers and CI/CD pipelines.
Eliminate Warm-up Time: Time-based benchmarks typically require warm-up runs to reach a steady state. Our approach provides meaningful results from a single run, saving time and resources.
Direct Insight into Planner Changes: By focusing on metrics like route calls, data sent, and memory usage, we get a clear picture of how planner changes affect query execution, rather than just seeing an overall time difference.
Proxy for Performance: While we ultimately care about query execution time, these metrics serve as reliable proxies that are much easier to measure precisely and consistently.
Easier Comparison: With standardized metrics, it's easier to compare results across different environments or Vitess versions, providing a more robust basis for decision-making.
Faster Iteration: The ability to quickly and reliably measure the impact of planner changes allows for faster development cycles and more confident optimizations.

This tool will make it significantly easier to evaluate and compare the effects of planner changes, providing a valuable complement to our existing performance measurement techniques.

Key Metrics to Measure

Data Sent from VTTablets
- Measure the Route operator
- Track the amount of data sent to VTGate
- Count the number of calls to routes
Route Efficiency
- Consider the work required by different route types (Scatter, Equal, EqualUnique)
- Compare the efficiency of routes returning similar amounts of data
Memory Consumption
- Measure memory usage of different operators
- Focus on memory-intensive operations like memory sort and hash join

Implementation Details

The benchmark tool will be implemented as an extension to the existing vitess-tester tooling.

Usage

# Install Vitess version X
vitess-tester --trace myQueries-X.trace t/myQueries.test -vschema t/vschema.json

# Install Vitess version Y
vitess-tester --trace myQueries-Y.trace t/myQueries.test -vschema t/vschema.json

# Compare traces
vitess-tester --trace-compare myQueries-X.trace myQueries-Y.trace

Test File Structure

The test files will follow this general structure:

CREATE TABLE statements to set up the schema
INSERT statements to populate the data
SELECT statements to run the queries

No additional directives are needed in the test files.

Mock Output Example

Trace Comparison: myQueries-X.trace vs myQueries-Y.trace

Query 1: SELECT * FROM customers WHERE id = 1
-------------------------------------------------------------------------
|                    | Version X | Version Y | Diff    | % Change      |
-------------------------------------------------------------------------
| Route Calls        | 1         | 1         | 0       | 0%            |
| Data Sent (rows)   | 1         | 1         | 0       | 0%            |
| Route Type         | Equal     | Equal     | -       | -             |
| Memory Usage (MB)  | 0.01      | 0.01      | 0       | 0%            |
-------------------------------------------------------------------------

Query 2: SELECT o.id, c.name FROM orders o JOIN customers c ON o.customer_id = c.id WHERE o.status = 'shipped'
-------------------------------------------------------------------------
|                    | Version X | Version Y | Diff    | % Change      |
-------------------------------------------------------------------------
| Route Calls        | 3         | 2         | -1      | -33.33%       |
| Data Sent (rows)   | 150       | 100       | -50     | -33.33%       |
| Route Type         | Scatter   | Equal     | -       | -             |
| Memory Usage (MB)  | 5         | 2         | -3      | -60%          |
-------------------------------------------------------------------------

Summary:
- 1 out of 2 queries showed significant change
- Average change in Route Calls: -16.67%
- Average change in Data Sent: -16.67%
- Average change in Memory Usage: -30%

vitessio / vt