Open DrewKimball opened 2 years ago
Describe the problem
Currently, sets of equivalent columns are represented inefficiently in functional dependencies. Consider a set of three equivalent columns (a, b, c). A FuncDepSet would represent this equivalence as (a=b,c), (b=a,c), (c=a,b), which would require 6 opt.ColSet structs to be allocated. In general, the number of opt.ColSets needed is 2n, where n is the number of columns in the equivalence. This can cause a large number of allocations when there are a large number of expressions in the memo with many equivalent columns, especially once the columns exceed the size of the small set.
To Reproduce
Profiling TPCH Q2 shows many allocations resulting from handling equivalencies. Additionally, customers have run into this issue, which caused long planning times.
Expected behavior
Each equivalence group should be represented by a single opt.ColSet rather than 2n of them.