Open onderkalaci opened 7 years ago
What about adding a number of a.partition_key = b.non_partition_key restrictions? IIRC they'd be hit by the current algorithm, and it's quite realistic to have a lot of them, without absurd numbers of joins.
@anarazel I've updated the issue; please see Query 7 to Query 10. Do the tests make sense to you? If so, we're still OK with those tests, given that the cost of running the algorithm is not high compared to the other parts of planning and execution. (By the way, I'm measuring elapsed time with an approach similar to the `log_disconnections()` function in `postgres.c`.)
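The benchmark figures in this issue are reported as hours, minutes, seconds, and milliseconds. As a rough illustration of that breakdown, here is a minimal Python sketch. It is not the C code used for the measurements, and the names `split_elapsed_ns`, `format_elapsed`, and `timed` are hypothetical helpers made up for this example.

```python
import time
from typing import Any, Callable, Tuple


def split_elapsed_ns(elapsed_ns: int) -> Tuple[int, int, int, int]:
    """Break an elapsed duration (nanoseconds) into (hours, minutes, seconds, msecs)."""
    total_msecs = elapsed_ns // 1_000_000
    total_secs, msecs = divmod(total_msecs, 1000)
    total_minutes, seconds = divmod(total_secs, 60)
    hours, minutes = divmod(total_minutes, 60)
    return hours, minutes, seconds, msecs


def format_elapsed(label: str, elapsed_ns: int) -> str:
    """Render a duration in the same layout as the benchmark lines in this issue."""
    hours, minutes, seconds, msecs = split_elapsed_ns(elapsed_ns)
    return (f"{label}: Hours: {hours} - Minutes: {minutes} "
            f"- Seconds: {seconds} - Msecs: {msecs}")


def timed(label: str, func: Callable[..., Any], *args: Any) -> Tuple[Any, str]:
    """Run func(*args) and return its result together with a formatted timing line."""
    start = time.perf_counter_ns()
    result = func(*args)
    return result, format_elapsed(label, time.perf_counter_ns() - start)
```

Working in integer nanoseconds avoids floating-point rounding when the duration is split into its components.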
While @anarazel was reviewing the changes for subquery pushdown in #1268, he had some concerns about the algorithmic complexity of the `RestrictionEquivalenceForPartitionKeys()` function. I'm opening this issue to discuss the specifics of the algorithm, share some benchmark results, and keep track of it.

First, let me share the pseudocode of the discussed algorithm:
Some parts of the above algorithm have high algorithmic complexity, especially checking the uniqueness of the equivalence members in the common equivalence class. However, as the benchmarks show, this complexity does not turn into a significant performance bottleneck, because we follow a very conservative approach when adding a restriction to an equivalence class.
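To make the cost of the uniqueness check concrete, here is a minimal Python sketch of the general idea: grow one common equivalence class from equality restrictions, then check that every relation's partition key ended up in it. This is not the pseudocode from this issue and not the actual `RestrictionEquivalenceForPartitionKeys()` implementation. The `(rte_id, column)` member representation and all function names are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation: a member is (rte_id, column_name).
Member = Tuple[int, str]
Equality = Tuple[Member, Member]


def add_member_if_new(members: List[Member], member: Member) -> bool:
    """Append member unless already present.

    The linear scan is the uniqueness check: repeated for every added
    member, it makes building the class quadratic in its size.
    """
    for existing in members:
        if existing == member:
            return False
    members.append(member)
    return True


def common_equivalence_class(equalities: Sequence[Equality],
                             seed: Member) -> List[Member]:
    """Grow a class from seed by absorbing equalities that touch it."""
    members = [seed]
    changed = True
    while changed:  # repeat until no equality adds a new member
        changed = False
        for left, right in equalities:
            if left in members:
                if add_member_if_new(members, right):
                    changed = True
            elif right in members:
                if add_member_if_new(members, left):
                    changed = True
    return members


def all_partition_keys_equal(equalities: Sequence[Equality],
                             partition_keys: Dict[int, str]) -> bool:
    """True if every relation's partition key is in one common class."""
    keys = list(partition_keys.items())
    if not keys:
        return True
    members = common_equivalence_class(equalities, keys[0])
    return all(key in members for key in keys)
```

For example, with three relations whose partition key is `id`, the equalities `1.id = 2.id` and `2.id = 3.id` place all three keys in one class, while `1.id = 2.id` and `2.val = 3.id` leave relation 3 out. A hash set would make the membership test cheap; the list is used here only to show where the cost comes from.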
Now, I'd like to share some benchmark results. Below, I break down the time spent in different parts of planning. The last item in each test shows the time taken to execute the whole algorithm that decides whether to push down the query or not (i.e., `Attribute Eq. Execution`). Note that I ran the tests on my local machine, and the test table creation queries are already in the regression tests in case anyone wants to reproduce the tests.
Query 1: 10 joins on the partition key
Copying original query: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 0
Assigning RTE Ids: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 0
Standard planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 2
Logical planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 0
Attribute Eq. Execution: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 0

Query 2: 100 joins on the partition key
Copying original query: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 5
Assigning RTE Ids: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 1
Standard planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 25
Logical planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 5
Attribute Eq. Execution: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 2

Query 3: 1000 joins on the partition key
Copying original query: Hours: 0 - Minutes: 0 - Seconds: 2 - Msecs: 845
Assigning RTE Ids: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 354
Standard planner: Hours: 0 - Minutes: 0 - Seconds: 6 - Msecs: 630
Logical planner: Hours: 0 - Minutes: 1 - Seconds: 21 - Msecs: 811
Attribute Eq. Execution: Hours: 0 - Minutes: 0 - Seconds: 2 - Msecs: 226

Query 4: 5000 joins on the partition key
Copying original query: Hours: 0 - Minutes: 1 - Seconds: 19 - Msecs: 619
Assigning RTE Ids: Hours: 0 - Minutes: 0 - Seconds: 59 - Msecs: 555
Standard planner: Hours: 0 - Minutes: 5 - Seconds: 48 - Msecs: 662
Logical planner: Hours: 0 - Minutes: 1 - Seconds: 21 - Msecs: 811
Attribute Eq. Execution: Hours: 0 - Minutes: 5 - Seconds: 0 - Msecs: 675

Query 5: 10000 joins on the partition key

Query 6: 100 joins, each containing 20 filters on the partition key; some filters are joins and some are consts (mostly consts are ANDed)
Copying original query: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 1
Assigning RTE Ids: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 1
Standard planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 34
Logical planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 11
Attribute Eq. Execution: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 4

Query 7: 5 tables, 5 joins on partition keys, 5 joins on partition key and non-partition keys
Copying original query: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 0
Assigning RTE Ids: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 0
Standard planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 5
Logical planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 6
Attribute Eq. Execution: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 1

Query 8: 5 tables, 5 joins on partition keys, 20 joins on partition key and non-partition keys
Copying original query: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 0
Assigning RTE Ids: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 0
Standard planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 6
Logical planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 7
Attribute Eq. Execution: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 1

Query 9: 5 tables, 5 joins on partition keys, 50 joins on partition key and non-partition keys
Copying original query: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 0
Assigning RTE Ids: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 0
Standard planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 8
Logical planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 9
Attribute Eq. Execution: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 2

Query 10: 5 tables, 5 joins on partition keys, 200 joins on partition key and non-partition keys
Copying original query: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 1
Assigning RTE Ids: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 0
Standard planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 14
Logical planner: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 6
Attribute Eq. Execution: Hours: 0 - Minutes: 0 - Seconds: 0 - Msecs: 2

As I mentioned above, the algorithmic complexity doesn't seem to lead to a performance bottleneck, given that we follow a very conservative approach when adding a restriction to an equivalence class.
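For anyone who wants to compare the phases numerically, the timing strings can be converted to plain milliseconds. The following Python helper is a small sketch for that purpose; `to_msecs` is a hypothetical name and is not part of the benchmark code.

```python
import re

# Matches the "Hours: H - Minutes: M - Seconds: S - Msecs: MS" layout,
# tolerating missing spaces after the colons.
_TIMING = re.compile(
    r"Hours:\s*(\d+)\s*-\s*Minutes:\s*(\d+)\s*-\s*Seconds:\s*(\d+)\s*-\s*Msecs:\s*(\d+)"
)


def to_msecs(timing: str) -> int:
    """Convert one benchmark timing string to total milliseconds."""
    match = _TIMING.search(timing)
    if match is None:
        raise ValueError(f"unrecognized timing line: {timing!r}")
    hours, minutes, seconds, msecs = (int(group) for group in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + msecs
```

With this, the Query 3 figures become 2226 ms for Attribute Eq. Execution against 6630 ms for the standard planner.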