cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
29.89k stars 3.77k forks source link

opt: improve functional dependencies #75070

Open mgartner opened 2 years ago

mgartner commented 2 years ago

When adding UNIQUE WITHOUT INDEX constraints for the non-shard columns of a hash-sharded index in #74235, we noticed two limitations of functional dependencies that required a hacky work-around to prevent the optimizer from created unnecessary uniqueness checks. The following improvements to FDs should allow us to remove the hack, and could lead to better query plans in other cases unrelated to hash-sharded indexes.

1. Keep track of the "laxness" of each column of a key.

Consider the FD:

key(11); (11)-->(12,13), (12,13)~~>(11), (12)-->(13)

If 13 is nullable, then 12 is not a lax key. For example, the FD allows for (c12, c13, c11) to have values (1, null, 1) and (1, null, 2).

But if 13 is NOT NULL, then 12 is a lax key. Two rows with the same non-null value for 12 means that they have the same non-null value for 13, which means that that they have the same value for all columns.

However, our FD implementation cannot determine this because "laxness" is a property of a determinant -> dependency relationship. There is no way to fully represent the (12,13)~~>(11) dependency where 13 is NOT NULL, so information is lost when encoding these relationships as an FD. If "laxness" was a property of the columns within a determinant, then we could determine that sets of columns are lax keys in more cases. For example, from (12,~13)-->(11), (12)-->(13) we could determine that 12 is a lax key.

2. Further simplify FDs.

Consider the FD:

key(11); (11)-->(12-15), (12,13)-->(11,14,15), (12)-->(13)

By transitivity, (12,13)-->(11,14,15) and (12)-->(13) imply (12)-->(11,13-15), so the FD should be simplified to:

key(11); (11)-->(12-15), (12)-->(11,13-15)

See https://github.com/cockroachdb/cockroach/pull/74235#issuecomment-1013238525 for more details.

Jira issue: CRDB-12452

mgartner commented 1 year ago

(2) sounds the same or similar to #74717.