datafuselabs / databend

๐——๐—ฎ๐˜๐—ฎ, ๐—”๐—ป๐—ฎ๐—น๐˜†๐˜๐—ถ๐—ฐ๐˜€ & ๐—”๐—œ. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
https://docs.databend.com
Other
7.43k stars 714 forks source link

Enhanced distributed query execution #14328

Open leiysky opened 5 months ago

leiysky commented 5 months ago

So far, we only support executing distributed query with a tree-like structure.

The physical plan will be splitted into fragments by Exchange operators. Each plan has a root fragment which is only executed by the coordinator node(i.e. the node handles the query request from user) and many intermidiate/source fragments which are partitioned to all the nodes in the cluster.

Here's an example:

image

But sometimes a intermidiate/source fragment can only be executed on a single node, for example a subquery with scalar aggregation or order by clause. Because we haven't supported exchange data from a single node to multiple nodes yet, the whole parent plan is degraded to root fragment and can not leverage the clusterwise parallelism.

image

These kinds of query is very common, for example:

WITH v AS (SELECT SUM(a) AS sum FROM t)
WITH w AS (SELECT a FROM t)
SELECT * FROM v, w WHERE w.a > v.sum

This query can only be executed on the root node due to there is a scalar aggregation in v. The ideal way is the result of v can be broadcast to other nodes so we can leverage the whole cluster to execute the cross join.

leiysky commented 5 months ago

cc @xudong963 @Dousir9