citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0

Shard pruning is performed twice for non-router plannable queries #681

Open mtuncer opened 8 years ago

mtuncer commented 8 years ago

We do shard pruning at an early stage of planning to decide whether a query is router plannable. If we decide that the query is not router plannable, we fall back to the regular planner and, at the end of planning, perform another round of shard pruning to decide which shards we need.

We should be able to reuse the previously pruned shards without performing another pruning pass.
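A rough sketch of the flow described above, with hypothetical names standing in for the actual Citus planner entry points:

```c
#include <stdbool.h>

typedef struct Query Query;
typedef struct List List;
typedef struct Plan Plan;

/* stand-ins for the real planner entry points, for illustration only */
extern List *PruneShards(Query *query);
extern bool  ShardListHasSingleShardPerRelation(List *prunedShards);
extern Plan *CreateRouterPlan(Query *query, List *prunedShards);
extern Plan *CreateRegularPlan(Query *query, List *prunedShards);

static Plan *
PlanDistributedQuery(Query *query)
{
	/* first pruning pass: only used to decide router plannability */
	List *prunedShards = PruneShards(query);

	if (ShardListHasSingleShardPerRelation(prunedShards))
		return CreateRouterPlan(query, prunedShards);

	/* not router plannable: fall back to the regular planner, which
	 * currently prunes a second time; the proposal is to hand the
	 * already-pruned list down instead */
	List *prunedAgain = PruneShards(query);
	return CreateRegularPlan(query, prunedAgain);
}
```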

ozgune commented 8 years ago

@mtuncer -- For our production users who have large shard counts (100K-1M), will this introduce a performance issue?

mtuncer commented 8 years ago

@ozgune if they are on a hash-partitioned table, yes. Users with that many shards are probably using append partitioning, though, so they should be safe.

ozgune commented 8 years ago

@mtuncer, thanks. If this only impacts customers who have hash distributed tables, then that's good.

We still have customers who use the real-time executor on their hash distributed tables (256-1025 shards). I'm guessing this change may not impact them though. Could you explain the following?

In particular, if an incoming query doesn't have a filter clause on the distribution column, do we drop into the router planner's shard pruning logic? -- I'll also send an internal email that has a bit more specific info.

mtuncer commented 8 years ago

How do we decide to use the router planner and executor?

We first check the executor type to see whether it is the real-time executor, and then check the distribution method of each relation referenced in the query. We require every relation to be distributed using the hash method. Finally, we rely on shard pruning to decide whether each relation can be reduced to a single shard.

If the query references more than one relation, we bail out of this check as soon as we see that one relation reduces to more than one shard.

When does the router planner's shard pruning logic kick in? It is activated for almost all SELECT queries that pass the executor and distribution method checks.

Even if the incoming query does not have a filter, we still enter the shard pruning logic.
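A minimal sketch of that decision, with stand-in types and names rather than the real Citus structures:

```c
#include <stdbool.h>

/* stand-in per-relation summary; not the actual Citus data structures */
typedef struct RelationEntry
{
	char distributionMethod;    /* 'h' = hash, 'a' = append, 'r' = range */
	int  prunedShardCount;      /* shards left after pruning this relation */
} RelationEntry;

static bool
QueryIsRouterPlannable(RelationEntry *relations, int relationCount,
                       bool usesRealTimeExecutor)
{
	/* executor type check comes first */
	if (!usesRealTimeExecutor)
		return false;

	for (int relationIndex = 0; relationIndex < relationCount; relationIndex++)
	{
		/* every referenced relation must be hash distributed */
		if (relations[relationIndex].distributionMethod != 'h')
			return false;

		/* bail out as soon as one relation prunes to more than one shard */
		if (relations[relationIndex].prunedShardCount != 1)
			return false;
	}

	return true;
}
```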

mtuncer commented 8 years ago

I looked at the shard pruning logic to see what happens when there are no constraints. It walks through each shard interval with no restrictions, and each call to predicate_refuted_by() returns immediately without performing any checks. So the only extra time spent is the iteration over the shard intervals, creating a bunch of hash comparison criteria, and a call to predicate_refuted_by() that returns immediately.
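A sketch of that no-constraint path (the names here are stand-ins; the real check is PostgreSQL's predicate_refuted_by(), simplified below):

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct ShardInterval ShardInterval;   /* stand-in for the real type */

/* simplified stand-in for predicate_refuted_by(): with nothing to refute
 * against, it returns false immediately and the shard is kept */
static bool
ShardIsRefuted(ShardInterval *shard, void *restrictionList)
{
	if (restrictionList == NULL)
		return false;          /* no constraints: nothing can be refuted */

	/* ... real predicate proofs would run here ... */
	return false;
}

static int
PruneShardList(ShardInterval **shards, int shardCount, void *restrictionList,
               ShardInterval **keptShards)
{
	int keptCount = 0;

	/* with no filter, every iteration falls through immediately, so the
	 * cost is dominated by walking the shard interval array itself */
	for (int shardIndex = 0; shardIndex < shardCount; shardIndex++)
	{
		if (!ShardIsRefuted(shards[shardIndex], restrictionList))
			keptShards[keptCount++] = shards[shardIndex];
	}

	return keptCount;
}
```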

ozgune commented 8 years ago

@mtuncer -- Based on this, are we then expecting the shard pruning overhead to be very low (when we don't have a filter clause on the distribution column)? Could we quantify this overhead with an example data set and queries?

As a separate question, how hard is it to introduce a semantic/logical check to verify the user's intent with this query?

anarazel commented 8 years ago

On 2016-08-01 11:03:30 -0700, Ozgun Erdogan wrote:

As a separate question, how hard is it to introduce a semantic/logical check to verify the user's intent with this query?

I think the issue is less how hard that is; the question is more how many legitimate queries such checks would prevent from being run.

ozgune commented 8 years ago

@mtuncer -- How much effort is it to run a few performance numbers on 128/256 shards to measure the planning overhead?

I added a spreadsheet to our shared Dev (team) > Benchmarks folder for a user who has 256 shards and latency expectations of under 300ms. I'm curious how much additional planning overhead we'd introduce for their workload.

mtuncer commented 8 years ago

@ozgune we should be able to get those numbers in a day including setup time.

Shall we prioritize this?

mtuncer commented 8 years ago

I ran an aggregation query hitting all shards of a single hash-distributed table. The aggregations include count/sum/hll_union. There is no filter on the partition column, and the GROUP BY clause includes the partition column.

The planner first tries to create a router plan, fails, and then follows the default planning path. The time spent trying the router planner and the total time spent in planning, per shard count, are listed below.

| shard count | router planner time (ms) | total planner time (ms) |
| --- | --- | --- |
| 16 | 0.175 | 8.363 |
| 32 | 0.176 | 9.061 |
| 64 | 0.317 | 14.812 |
| 128 | 0.929 | 32.441 |
| 256 | 1.118 | 54.205 |
| 512 | 1.570 | 111.637 |
| 1024 | 3.911 | 208.870 |

It looks like the duplicate shard pruning has close to no effect on the total time spent in the planner: at 1024 shards, for example, the router planner attempt takes 3.911 ms out of 208.870 ms total. The overall performance penalty is around 1.5% to 2% of total planning time.

I think we can live with this.

mtuncer commented 8 years ago

These numbers were taken by instrumenting the CreatePhysicalPlan() function. They were gathered on a MacBook Air with a 1.7 GHz i7 processor and 4 GB of RAM. PostgreSQL 9.5.4 and Citus 5.2 were compiled in DEBUG mode with optimizations disabled.
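For reference, one way to take such a measurement (a sketch only, not necessarily the exact instrumentation used for the numbers above) is to wrap the planner entry point, shown here as a hypothetical PlanQuery(), with a monotonic clock:

```c
#include <stdio.h>
#include <time.h>

typedef struct Query Query;
typedef struct Plan Plan;

extern Plan *PlanQuery(Query *query);    /* hypothetical planner entry point */

static Plan *
PlanQueryTimed(Query *query)
{
	struct timespec start;
	struct timespec end;

	clock_gettime(CLOCK_MONOTONIC, &start);
	Plan *plan = PlanQuery(query);
	clock_gettime(CLOCK_MONOTONIC, &end);

	/* log elapsed wall-clock time in milliseconds */
	double elapsedMs = (end.tv_sec - start.tv_sec) * 1000.0 +
	                   (end.tv_nsec - start.tv_nsec) / 1000000.0;
	fprintf(stderr, "planner time: %.3f ms\n", elapsedMs);

	return plan;
}
```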

onderkalaci commented 7 years ago

Could we use relationRestrictionContext to hold/save the prunedShards so that multi-planner can use that?

jasonmp85 commented 7 years ago

Could we use relationRestrictionContext to hold/save the prunedShards so that multi-planner can use that?

I haven't looked at what that does in a while, but my gut reaction is :+1: to that?
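A rough sketch of what that could look like (the field and helper names below are made up for illustration): the router-plannability check saves its pruning result on the per-relation restriction entry, and the multi planner reads it back instead of pruning again.

```c
typedef struct List List;                     /* PostgreSQL's List type */

/* hypothetical shape of a per-relation restriction entry; the real
 * relationRestrictionContext fields are elided here */
typedef struct RelationRestriction
{
	unsigned int relationId;
	List        *prunedShardIntervalList;     /* proposed: saved pruning result */
} RelationRestriction;

/* called from the router-plannability check after pruning one relation */
static void
SavePrunedShards(RelationRestriction *restriction, List *prunedShards)
{
	restriction->prunedShardIntervalList = prunedShards;
}

/* called from the multi planner; NULL means it still has to prune itself */
static List *
LookupPrunedShards(RelationRestriction *restriction)
{
	return restriction->prunedShardIntervalList;
}
```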