cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

sql, opt: support broadcast join #84731

Open rytaft opened 2 years ago

rytaft commented 2 years ago

Is your feature request related to a problem? Please describe.

Today, the only way we support distributed joins (specifically distributed hash and merge joins) is by hash-partitioning both sides of the join on the join key. However, when one side is much smaller than the other, it would be better to broadcast the smaller input to all nodes involved in the distributed join, so that the larger input does not have to be repartitioned. This is commonly called a "broadcast join", and it's not currently supported in CockroachDB.
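To make the "much smaller" condition concrete, here is a rough back-of-the-envelope sketch in Go. It is not CockroachDB's cost model. The function names are hypothetical. It assumes rows are spread uniformly over `n` nodes, that network cost is proportional to rows shipped, and that the large input can stay where it is when the small side is broadcast.

```go
package main

import "fmt"

// rowsShippedHash estimates how many rows cross the network when both inputs
// are hash-partitioned over n nodes. Assumption: a row already sits on its
// target node with probability 1/n, so (n-1)/n of all rows must move.
func rowsShippedHash(small, large, n int) float64 {
	return float64(small+large) * float64(n-1) / float64(n)
}

// rowsShippedBroadcast estimates how many rows cross the network when the
// small input is copied to every node. Assumption: each small row starts on
// one node and is sent to the other n-1; the large input is not moved.
func rowsShippedBroadcast(small, n int) float64 {
	return float64(small) * float64(n-1)
}

// preferBroadcast reports whether broadcasting ships fewer rows than
// hash-partitioning under the assumptions above.
func preferBroadcast(small, large, n int) bool {
	return rowsShippedBroadcast(small, n) < rowsShippedHash(small, large, n)
}

func main() {
	// 1,000 rows joined with 10,000,000 rows on 5 nodes.
	fmt.Println(preferBroadcast(1000, 10000000, 5)) // true
	// 4,000,000 rows joined with 10,000,000 rows on 5 nodes.
	fmt.Println(preferBroadcast(4000000, 10000000, 5)) // false
}
```

Under these simplified assumptions the break-even point is roughly `small * (n-1) < large`. A real cost model would also need to account for memory on each node, skew, and existing data placement.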

Describe the solution you'd like

We should add support for broadcast join. This will require a number of updates to the execution engine. For example, broadcast joins will use "MIRROR" instead of "HASH" routing [1]. The optimizer will also need to be made aware of the cost differences between the two types of distributed joins. Solving https://github.com/cockroachdb/cockroach/issues/47226 is a prerequisite.

[1] MIRROR vs. HASH routing: https://github.com/cockroachdb/cockroach/blob/d1879ebd6349ea90f0c6d2355d1d437772be34bb/pkg/sql/execinfrapb/data.proto#L146-L150
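For readers unfamiliar with the two routing modes, the difference can be illustrated with a toy sketch. This is not the execution engine's router. The `row` type and the `routeByHash` and `routeMirror` functions are hypothetical names invented for this example, and FNV is used only as a stand-in hash function.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// row is a toy stand-in for a table row with a join key.
type row struct {
	key string
	val int
}

// routeByHash sends each row to exactly one of n output streams, chosen by
// hashing the join key. Rows with equal keys always land on the same stream.
func routeByHash(rows []row, n int) [][]row {
	out := make([][]row, n)
	for _, r := range rows {
		h := fnv.New32a()
		h.Write([]byte(r.key))
		i := int(h.Sum32() % uint32(n))
		out[i] = append(out[i], r)
	}
	return out
}

// routeMirror sends a full copy of the input to every one of n output
// streams, which is what a broadcast of the small side needs.
func routeMirror(rows []row, n int) [][]row {
	out := make([][]row, n)
	for i := range out {
		out[i] = append([]row(nil), rows...)
	}
	return out
}

func main() {
	small := []row{{"a", 1}, {"b", 2}, {"c", 3}}
	for i, s := range routeByHash(small, 3) {
		fmt.Printf("hash   stream %d: %d rows\n", i, len(s))
	}
	for i, s := range routeMirror(small, 3) {
		fmt.Printf("mirror stream %d: %d rows\n", i, len(s))
	}
}
```

With hash routing the row counts across streams sum to the input size. With mirror routing every stream receives the whole input, so a join processor on each node can match it against whatever portion of the larger input that node reads.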

Jira issue: CRDB-17833

github-actions[bot] commented 10 months ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!