Closed: nvanbenschoten closed this issue 3 years ago.
Here's a possibly related INSERT that is slowed down by its foreign key check: stmt-bundle-650214558116610049.zip.
I took a look at what was going wrong here and what it would take to push the constant join predicate through to the other side of the join so that its computed column could be derived and constrained. In doing so, I started playing around with other similar queries.
It turned out that when using an INT column instead of the ENUM column, things actually worked, which I wasn't expecting:
demo@127.0.0.1:26257/movr> CREATE TABLE stock (
s_i_id INT8 NOT NULL,
s_w_id INT8 NOT NULL,
s_quantity INT8 NULL,
crdb_region INT NOT VISIBLE NOT NULL AS (CASE WHEN s_w_id BETWEEN 0 AND 1665 THEN 100 WHEN s_w_id BETWEEN 1666 AND 3332 THEN 200 WHEN s_w_id BETWEEN 3333 AND 4999 THEN 300 END) STORED,
CONSTRAINT "primary" PRIMARY KEY (crdb_region ASC, s_w_id ASC, s_i_id ASC)
);
CREATE TABLE order_line (
ol_o_id INT8 NOT NULL,
ol_d_id INT8 NOT NULL,
ol_w_id INT8 NOT NULL,
ol_number INT8 NOT NULL,
ol_i_id INT8 NOT NULL,
crdb_region INT NOT VISIBLE NOT NULL AS (CASE WHEN ol_w_id BETWEEN 0 AND 1665 THEN 400 WHEN ol_w_id BETWEEN 1666 AND 3332 THEN 500 WHEN ol_w_id BETWEEN 3333 AND 4999 THEN 600 END) STORED,
CONSTRAINT "primary" PRIMARY KEY (crdb_region ASC, ol_w_id ASC, ol_d_id ASC, ol_o_id DESC, ol_number ASC)
);
ALTER TABLE stock INJECT STATISTICS '[
{
"columns": [
"s_i_id"
],
"created_at": "2021-04-13 19:20:01.481197",
"distinct_count": 100000,
"histo_col_type": "",
"name": "__auto__",
"null_count": 0,
"row_count": 500000000
},
{
"columns": [
"s_w_id"
],
"created_at": "2021-04-13 19:20:01.481198",
"distinct_count": 5000,
"histo_col_type": "",
"name": "__auto__",
"null_count": 0,
"row_count": 500000000
},
{
"columns": [
"s_quantity"
],
"created_at": "2021-04-13 19:20:01.481199",
"distinct_count": 91,
"histo_col_type": "",
"name": "__auto__",
"null_count": 0,
"row_count": 500000000
}
]';
EXPLAIN (opt) SELECT
count(*)
FROM
(
SELECT
DISTINCT s_i_id
FROM
order_line JOIN stock ON s_i_id = ol_i_id AND s_w_id = ol_w_id
WHERE
ol_w_id = 1 AND ol_d_id = 6 AND ol_o_id BETWEEN (3023 - 20) AND (3023 - 1) AND s_quantity < 14
);
info
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
scalar-group-by
├── columns: count:15
├── cardinality: [1 - 1]
├── stats: [rows=1]
├── cost: 8.4219795
├── key: ()
├── fd: ()-->(15)
├── prune: (15)
├── distinct-on
│ ├── columns: s_i_id:9
│ ├── grouping columns: s_i_id:9
│ ├── stats: [rows=0.817973447, distinct(9)=0.817973447, null(9)=0]
│ ├── cost: 8.38379977
│ ├── key: (9)
│ └── inner-join (lookup stock)
│ ├── columns: ol_o_id:1 ol_d_id:2 ol_w_id:3 ol_i_id:5 s_i_id:9 s_w_id:10 s_quantity:11
│ ├── key columns: [17 3 5] = [12 10 9]
│ ├── lookup columns are key
│ ├── stats: [rows=1, distinct(3)=0.821, null(3)=0, distinct(5)=0.817973447, null(5)=0, distinct(9)=0.817973447, null(9)=0, distinct(10)=0.821, null(10)=0]
│ ├── cost: 8.33562004
│ ├── fd: ()-->(2,3,10), (9)-->(11), (5)==(9), (9)==(5), (3)==(10), (10)==(3)
│ ├── project
│ │ ├── columns: "lookup_join_const_col_@12":17 ol_o_id:1 ol_d_id:2 ol_w_id:3 ol_i_id:5
│ │ ├── stats: [rows=0.821, distinct(1)=0.821, null(1)=0, distinct(2)=0.821, null(2)=0, distinct(3)=0.821, null(3)=0, distinct(5)=0.817973447, null(5)=0, distinct(17)=0.821, null(17)=0]
│ │ ├── cost: 5.02162
│ │ ├── fd: ()-->(2,3,17)
│ │ ├── scan order_line
│ │ │ ├── columns: ol_o_id:1 ol_d_id:2 ol_w_id:3 ol_i_id:5
│ │ │ ├── constraint: /6/3/2/-1/4: [/400/1/6/3022 - /400/1/6/3003]
│ │ │ ├── stats: [rows=0.821, distinct(1)=0.821, null(1)=0, distinct(2)=0.821, null(2)=0, distinct(3)=0.821, null(3)=0, distinct(5)=0.817973447, null(5)=0, distinct(2,3)=0.821, null(2,3)=0, distinct(1-3)=0.821, null(1-3)=0]
│ │ │ ├── cost: 4.9952
│ │ │ ├── fd: ()-->(2,3)
│ │ │ └── prune: (5)
│ │ └── projections
│ │ └── 100 [as="lookup_join_const_col_@12":17]
│ └── filters
│ ├── s_w_id:10 = 1 [outer=(10), constraints=(/10: [/1 - /1]; tight), fd=()-->(10)]
│ └── s_quantity:11 < 14 [outer=(11), constraints=(/11: (/NULL - /13]; tight)]
└── aggregations
└── count-rows [as=count_rows:15]
Notice the project of `100` as the 17th column, with an alias of `lookup_join_const_col_@12`.
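The computed `crdb_region` column in the INT schema above buckets `s_w_id` into one of three region codes via a CASE expression. Written as a plain Go function (a sketch of the schema's bucketing logic, not actual optimizer code; the name `regionForWarehouse` is made up), the mapping looks like:

```go
package main

import "fmt"

// regionForWarehouse mirrors the CASE expression behind the stock table's
// computed crdb_region column: warehouse IDs are bucketed into three
// region codes. The boolean reports whether any bucket matched; outside
// all buckets the CASE expression would yield NULL.
func regionForWarehouse(wID int) (int, bool) {
	switch {
	case wID >= 0 && wID <= 1665:
		return 100, true
	case wID >= 1666 && wID <= 3332:
		return 200, true
	case wID >= 3333 && wID <= 4999:
		return 300, true
	}
	return 0, false
}

func main() {
	r, ok := regionForWarehouse(1)
	fmt.Println(r, ok) // warehouse 1 falls in the first bucket
}
```

Since the query fixes `ol_w_id = 1` (and the join forces `s_w_id = ol_w_id`), this mapping folds to the single constant `100`, which is exactly the value projected as `lookup_join_const_col_@12` in the plan above.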
However, when I switched back to the ENUM, things were back to what I was seeing above:
demo@127.0.0.1:26257/movr> CREATE TYPE region AS ENUM ('europe-west2','us-east1','us-west1');
CREATE TABLE stock (
s_i_id INT8 NOT NULL,
s_w_id INT8 NOT NULL,
s_quantity INT8 NULL,
crdb_region region NOT VISIBLE NOT NULL AS (CASE WHEN s_w_id BETWEEN 0 AND 1665 THEN 'us-east1' WHEN s_w_id BETWEEN 1666 AND 3332 THEN 'us-west1' WHEN s_w_id BETWEEN 3333 AND 4999 THEN 'europe-west2' END) STORED,
CONSTRAINT "primary" PRIMARY KEY (crdb_region ASC, s_w_id ASC, s_i_id ASC)
);
CREATE TABLE order_line (
ol_o_id INT8 NOT NULL,
ol_d_id INT8 NOT NULL,
ol_w_id INT8 NOT NULL,
ol_number INT8 NOT NULL,
ol_i_id INT8 NOT NULL,
crdb_region region NOT VISIBLE NOT NULL AS (CASE WHEN ol_w_id BETWEEN 0 AND 1665 THEN 'us-west1' WHEN ol_w_id BETWEEN 1666 AND 3332 THEN 'europe-west2' WHEN ol_w_id BETWEEN 3333 AND 4999 THEN 'us-east1' END) STORED,
CONSTRAINT "primary" PRIMARY KEY (crdb_region ASC, ol_w_id ASC, ol_d_id ASC, ol_o_id DESC, ol_number ASC)
);
ALTER TABLE stock INJECT STATISTICS '[
{
"columns": [
"s_i_id"
],
"created_at": "2021-04-13 19:20:01.481197",
"distinct_count": 100000,
"histo_col_type": "",
"name": "__auto__",
"null_count": 0,
"row_count": 500000000
},
{
"columns": [
"s_w_id"
],
"created_at": "2021-04-13 19:20:01.481198",
"distinct_count": 5000,
"histo_col_type": "",
"name": "__auto__",
"null_count": 0,
"row_count": 500000000
},
{
"columns": [
"s_quantity"
],
"created_at": "2021-04-13 19:20:01.481199",
"distinct_count": 91,
"histo_col_type": "",
"name": "__auto__",
"null_count": 0,
"row_count": 500000000
}
]';
EXPLAIN (opt,verbose) SELECT
count(*)
FROM
(
SELECT
DISTINCT s_i_id
FROM
order_line JOIN stock ON s_i_id = ol_i_id AND s_w_id = ol_w_id
WHERE
ol_w_id = 1 AND ol_d_id = 6 AND ol_o_id BETWEEN (3023 - 20) AND (3023 - 1) AND s_quantity < 14
);
info
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
scalar-group-by
├── columns: count:15
├── cardinality: [1 - 1]
├── stats: [rows=1]
├── cost: 15.0900571
├── key: ()
├── fd: ()-->(15)
├── prune: (15)
├── distinct-on
│ ├── columns: s_i_id:9
│ ├── grouping columns: s_i_id:9
│ ├── stats: [rows=0.817973447, distinct(9)=0.817973447, null(9)=0]
│ ├── cost: 15.0518773
│ ├── key: (9)
│ └── inner-join (lookup stock)
│ ├── columns: ol_o_id:1 ol_d_id:2 ol_w_id:3 ol_i_id:5 s_i_id:9 s_w_id:10 s_quantity:11
│ ├── key columns: [17 3 5] = [12 10 9]
│ ├── lookup columns are key
│ ├── stats: [rows=1, distinct(3)=0.821, null(3)=0, distinct(5)=0.817973447, null(5)=0, distinct(9)=0.817973447, null(9)=0, distinct(10)=0.821, null(10)=0]
│ ├── cost: 15.0036976
│ ├── fd: ()-->(2,3,10), (9)-->(11), (5)==(9), (9)==(5), (3)==(10), (10)==(3)
│ ├── inner-join (cross)
│ │ ├── columns: ol_o_id:1 ol_d_id:2 ol_w_id:3 ol_i_id:5 "lookup_join_const_col_@12":17
│ │ ├── multiplicity: left-rows(zero-or-more), right-rows(one-or-more)
│ │ ├── stats: [rows=2.463, distinct(1)=0.821, null(1)=0, distinct(2)=0.821, null(2)=0, distinct(3)=0.821, null(3)=0, distinct(5)=0.817973447, null(5)=0, distinct(17)=2.463, null(17)=0]
│ │ ├── cost: 5.1216975
│ │ ├── fd: ()-->(2,3)
│ │ ├── values
│ │ │ ├── columns: "lookup_join_const_col_@12":17
│ │ │ ├── cardinality: [3 - 3]
│ │ │ ├── stats: [rows=3, distinct(17)=3, null(17)=0]
│ │ │ ├── cost: 0.04
│ │ │ ├── ('europe-west2',)
│ │ │ ├── ('us-east1',)
│ │ │ └── ('us-west1',)
│ │ ├── scan order_line
│ │ │ ├── columns: ol_o_id:1 ol_d_id:2 ol_w_id:3 ol_i_id:5
│ │ │ ├── constraint: /6/3/2/-1/4: [/'us-west1'/1/6/3022 - /'us-west1'/1/6/3003]
│ │ │ ├── stats: [rows=0.821, distinct(1)=0.821, null(1)=0, distinct(2)=0.821, null(2)=0, distinct(3)=0.821, null(3)=0, distinct(5)=0.817973447, null(5)=0, distinct(2,3)=0.821, null(2,3)=0, distinct(1-3)=0.821, null(1-3)=0]
│ │ │ ├── cost: 4.9952
│ │ │ ├── fd: ()-->(2,3)
│ │ │ └── prune: (5)
│ │ └── filters (true)
│ └── filters
│ ├── s_w_id:10 = 1 [outer=(10), constraints=(/10: [/1 - /1]; tight), fd=()-->(10)]
│ └── s_quantity:11 < 14 [outer=(11), constraints=(/11: (/NULL - /13]; tight)]
└── aggregations
└── count-rows [as=count_rows:15]
Notice the values expression with `'europe-west2'`, `'us-east1'`, and `'us-west1'` as the 17th column, with an alias of `lookup_join_const_col_@12`.
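With only the implicit three-member CHECK constraint available, the lookup join must try every region for every input row, which is what the values/cross-join pair above encodes. A toy Go sketch of that key expansion (the `lookupKey` type and `expandLookupKeys` helper are hypothetical, not the actual execution code):

```go
package main

import "fmt"

// lookupKey is a stand-in for one (region, row) lookup into the stock
// table's primary index, whose prefix is (crdb_region, s_w_id, s_i_id).
type lookupKey struct {
	region string
	rowID  int
}

// expandLookupKeys pairs each input row with every candidate value for the
// lookup column, the way the cross join above turns each order_line row
// into one lookup per enum member.
func expandLookupKeys(rowIDs []int, regions []string) []lookupKey {
	keys := make([]lookupKey, 0, len(rowIDs)*len(regions))
	for _, id := range rowIDs {
		for _, r := range regions {
			keys = append(keys, lookupKey{region: r, rowID: id})
		}
	}
	return keys
}

func main() {
	regions := []string{"europe-west2", "us-east1", "us-west1"}
	keys := expandLookupKeys([]int{1}, regions) // a single order_line row
	fmt.Println(len(keys))                      // three lookups instead of one
}
```

One `order_line` row fans out into three lookup keys here, versus the single projected constant in the INT plan above.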
With a bit of code instrumentation, I found that inside of `CustomFuncs.GenerateLookupJoins`, we actually had both column filters for this `crdb_region` column in the `optionalFilters` slice. `crdb_region IN ('europe-west2', 'us-east1', 'us-west1')` came from the call to `CustomFuncs.checkConstraintFilters`, and `crdb_region = 'us-east1'` came from the call to `CustomFuncs.computedColFilters`.
The problem was that `CustomFuncs.findJoinFilterConstants` was returning the first of these because it matched, without realizing that a better match existed later in the slice of filters.
To demonstrate this, I made the following patch, and the problem disappeared. With it, we correctly constrained the `crdb_region` column:
--- a/pkg/sql/opt/xform/join_funcs.go
+++ b/pkg/sql/opt/xform/join_funcs.go
@@ -223,9 +223,9 @@ func (c *CustomFuncs) GenerateLookupJoins(
// Generate implicit filters from CHECK constraints and computed columns as
// optional filters to help generate lookup join keys.
- optionalFilters := c.checkConstraintFilters(scanPrivate.Table)
- computedColFilters := c.computedColFilters(scanPrivate, on, optionalFilters)
- optionalFilters = append(optionalFilters, computedColFilters...)
+ optionalFiltersBefore := c.checkConstraintFilters(scanPrivate.Table)
+ computedColFilters := c.computedColFilters(scanPrivate, on, optionalFiltersBefore)
+ optionalFilters := append(computedColFilters, optionalFiltersBefore...)
var pkCols opt.ColList
var iter scanIndexIter
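The sensitivity to slice ordering can be reproduced outside the optimizer. In this sketch (a stand-in `filter` type, not the real `memo.FiltersExpr`), `firstMatch` mimics the unpatched `findJoinFilterConstants` and `bestMatch` prefers the filter with the fewest constant values:

```go
package main

import "fmt"

// filter is a stand-in for a FiltersItem that constrains one column to a
// set of constant values.
type filter struct {
	col  string
	vals []string
}

// firstMatch returns the first filter on col, like the unpatched
// findJoinFilterConstants: the first match wins regardless of quality.
func firstMatch(filters []filter, col string) (filter, bool) {
	for _, f := range filters {
		if f.col == col {
			return f, true
		}
	}
	return filter{}, false
}

// bestMatch scans all filters on col and keeps the one with the fewest
// constant values, so a single derived constant beats a multi-value IN list.
func bestMatch(filters []filter, col string) (filter, bool) {
	var best filter
	found := false
	for _, f := range filters {
		if f.col == col && (!found || len(f.vals) < len(best.vals)) {
			best, found = f, true
		}
	}
	return best, found
}

func main() {
	// The check-constraint filter precedes the computed-column filter,
	// matching the optionalFilters ordering described above.
	fs := []filter{
		{col: "crdb_region", vals: []string{"europe-west2", "us-east1", "us-west1"}},
		{col: "crdb_region", vals: []string{"us-east1"}},
	}
	f1, _ := firstMatch(fs, "crdb_region")
	f2, _ := bestMatch(fs, "crdb_region")
	fmt.Println(len(f1.vals), len(f2.vals)) // 3 1
}
```

With the check-constraint filter first in the slice, `firstMatch` picks the three-value IN list while `bestMatch` picks the single derived constant, which is why reordering the appends in the patch above changes the plan.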
demo@127.0.0.1:26257/movr> EXPLAIN (opt,verbose) SELECT
count(*)
FROM
(
SELECT
DISTINCT s_i_id
FROM
order_line JOIN stock ON s_i_id = ol_i_id AND s_w_id = ol_w_id
WHERE
ol_w_id = 1 AND ol_d_id = 6 AND ol_o_id BETWEEN (3023 - 20) AND (3023 - 1) AND s_quantity < 14
);
info
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
scalar-group-by
├── columns: count:15
├── cardinality: [1 - 1]
├── stats: [rows=1]
├── cost: 8.4219795
├── key: ()
├── fd: ()-->(15)
├── prune: (15)
├── distinct-on
│ ├── columns: s_i_id:9
│ ├── grouping columns: s_i_id:9
│ ├── stats: [rows=0.817973447, distinct(9)=0.817973447, null(9)=0]
│ ├── cost: 8.38379977
│ ├── key: (9)
│ └── inner-join (lookup stock)
│ ├── columns: ol_o_id:1 ol_d_id:2 ol_w_id:3 ol_i_id:5 s_i_id:9 s_w_id:10 s_quantity:11
│ ├── key columns: [17 3 5] = [12 10 9]
│ ├── lookup columns are key
│ ├── stats: [rows=1, distinct(3)=0.821, null(3)=0, distinct(5)=0.817973447, null(5)=0, distinct(9)=0.817973447, null(9)=0, distinct(10)=0.821, null(10)=0]
│ ├── cost: 8.33562004
│ ├── fd: ()-->(2,3,10), (9)-->(11), (5)==(9), (9)==(5), (3)==(10), (10)==(3)
│ ├── project
│ │ ├── columns: "lookup_join_const_col_@12":17 ol_o_id:1 ol_d_id:2 ol_w_id:3 ol_i_id:5
│ │ ├── stats: [rows=0.821, distinct(1)=0.821, null(1)=0, distinct(2)=0.821, null(2)=0, distinct(3)=0.821, null(3)=0, distinct(5)=0.817973447, null(5)=0, distinct(17)=0.821, null(17)=0]
│ │ ├── cost: 5.02162
│ │ ├── fd: ()-->(2,3,17)
│ │ ├── scan order_line
│ │ │ ├── columns: ol_o_id:1 ol_d_id:2 ol_w_id:3 ol_i_id:5
│ │ │ ├── constraint: /6/3/2/-1/4: [/'us-west1'/1/6/3022 - /'us-west1'/1/6/3003]
│ │ │ ├── stats: [rows=0.821, distinct(1)=0.821, null(1)=0, distinct(2)=0.821, null(2)=0, distinct(3)=0.821, null(3)=0, distinct(5)=0.817973447, null(5)=0, distinct(2,3)=0.821, null(2,3)=0, distinct(1-3)=0.821, null(1-3)=0]
│ │ │ ├── cost: 4.9952
│ │ │ ├── fd: ()-->(2,3)
│ │ │ └── prune: (5)
│ │ └── projections
│ │ └── 'us-east1' [as="lookup_join_const_col_@12":17]
│ └── filters
│ ├── s_w_id:10 = 1 [outer=(10), constraints=(/10: [/1 - /1]; tight), fd=()-->(10)]
│ └── s_quantity:11 < 14 [outer=(11), constraints=(/11: (/NULL - /13]; tight)]
└── aggregations
└── count-rows [as=count_rows:15]
So the issue seems to be that `CustomFuncs.findJoinFilterConstants` returns any matching filter instead of the best matching filter. I don't know how best to fix this, as I don't think it's always the case that the filters returned from `computedColFilters` are preferable to those returned from `checkConstraintFilters`.
Should we be choosing the matching filter that results in the smallest number of `constVals`? A patch like the following does seem to fix things:
diff --git a/pkg/sql/opt/xform/join_funcs.go b/pkg/sql/opt/xform/join_funcs.go
index cba1be97fa..8b43041514 100644
--- a/pkg/sql/opt/xform/join_funcs.go
+++ b/pkg/sql/opt/xform/join_funcs.go
@@ -906,13 +906,15 @@ func (c *CustomFuncs) mapInvertedJoin(
}
// findJoinFilterConstants tries to find a filter that is exactly equivalent to
-// constraining the given column to a constant value or a set of constant
-// values. If successful, the constant values and the index of the constraining
-// FiltersItem are returned. Note that the returned constant values do not
-// contain NULL.
+// constraining the given column to a constant value or a set of constant values
+// and minimizes the number of constant values. If successful, the constant
+// values and the index of the constraining FiltersItem are returned. Note that
+// the returned constant values do not contain NULL.
func (c *CustomFuncs) findJoinFilterConstants(
filters memo.FiltersExpr, col opt.ColumnID,
) (values tree.Datums, filterIdx int, ok bool) {
+ var bestValues tree.Datums
+ var bestFilterIdx int
for filterIdx := range filters {
props := filters[filterIdx].ScalarProps()
if props.TightConstraints {
@@ -927,12 +929,16 @@ func (c *CustomFuncs) findJoinFilterConstants(
break
}
}
- if !hasNull {
- return constVals, filterIdx, true
+ if !hasNull && (bestValues == nil || len(bestValues) > len(constVals)) {
+ bestValues = constVals
+ bestFilterIdx = filterIdx
}
}
}
- return nil, -1, false
+ if bestValues == nil {
+ return nil, -1, false
+ }
+ return bestValues, bestFilterIdx, true
}
// constructJoinWithConstants constructs a cross join that joins every row in
cc @mgartner, as you appear to be the expert here.
Here's a possibly related INSERT that is slowed down by its foreign key check
Unfortunately, this did not seem to help the foreign key checks. From my reading of the EXPLAIN output, we aren't even pushing the constant non-computed columns through to the other side of that lookup join (see "lookup expression"):
root@localhost:26257/tpcc> EXPLAIN (opt, verbose) INSERT INTO history (h_c_id, h_c_d_id, h_c_w_id, h_d_id, h_w_id, h_amount, h_date, h_data) VALUES (2057, 4, 3, 4, 3, 2100.9, '2021-04-15 15:22:14', 'test data');
info
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
insert history
├── columns: <none>
├── insert-mapping:
│ ├── column21:21 => rowid:1
│ ├── column1:13 => history.h_c_id:2
│ ├── column2:14 => history.h_c_d_id:3
│ ├── column3:15 => history.h_c_w_id:4
│ ├── column4:16 => history.h_d_id:5
│ ├── column5:17 => history.h_w_id:6
│ ├── column7:19 => h_date:7
│ ├── h_amount:22 => history.h_amount:8
│ ├── column8:20 => h_data:9
│ └── column23:23 => history.crdb_region:10
├── check columns: check1:24
├── input binding: &1
├── cardinality: [0 - 0]
├── volatile, mutations
├── stats: [rows=0]
├── cost: 8.0713654
├── values
│ ├── columns: column1:13 column2:14 column3:15 column4:16 column5:17 column7:19 column8:20 column21:21 h_amount:22 column23:23 check1:24
│ ├── cardinality: [1 - 1]
│ ├── volatile
│ ├── stats: [rows=1, distinct(13)=1, null(13)=0, distinct(14)=1, null(14)=0, distinct(15)=1, null(15)=0, distinct(16)=1, null(16)=0, distinct(17)=1, null(17)=0]
│ ├── cost: 0.02
│ ├── key: ()
│ ├── fd: ()-->(13-17,19-24)
│ ├── prune: (13-17,19-24)
│ └── (2057, 4, 3, 4, 3, '2021-04-15 15:22:14', 'test data', gen_random_uuid(), 2100.9, 'us-east1', true)
└── f-k-checks
├── f-k-checks-item: history(h_c_w_id,h_c_d_id,h_c_id) -> customer(c_w_id,c_d_id,c_id)
│ └── anti-join (lookup customer)
│ ├── columns: h_c_w_id:25 h_c_d_id:26 h_c_id:27
│ ├── lookup expression
│ │ └── filters
│ │ ├── h_c_w_id:25 = c_w_id:30 [outer=(25,30), constraints=(/25: (/NULL - ]; /30: (/NULL - ]), fd=(25)==(30), (30)==(25)]
│ │ ├── h_c_d_id:26 = c_d_id:29 [outer=(26,29), constraints=(/26: (/NULL - ]; /29: (/NULL - ]), fd=(26)==(29), (29)==(26)]
│ │ ├── h_c_id:27 = c_id:28 [outer=(27,28), constraints=(/27: (/NULL - ]; /28: (/NULL - ]), fd=(27)==(28), (28)==(27)]
│ │ └── customer.crdb_region:49 IN ('europe-west2', 'us-east1', 'us-west1') [outer=(49), constraints=(/49: [/'europe-west2' - /'europe-west2'] [/'us-east1' - /'us-east1'] [/'us-west1' - /'us-west1']; tight)]
│ ├── lookup columns are key
│ ├── cardinality: [0 - 1]
│ ├── stats: [rows=1e-10]
│ ├── cost: 4.02000049
│ ├── key: ()
│ ├── fd: ()-->(25-27)
│ ├── cte-uses
│ │ └── &1: count=1 used-columns=(13-15)
│ ├── with-scan &1
│ │ ├── columns: h_c_w_id:25 h_c_d_id:26 h_c_id:27
│ │ ├── mapping:
│ │ │ ├── column3:15 => h_c_w_id:25
│ │ │ ├── column2:14 => h_c_d_id:26
│ │ │ └── column1:13 => h_c_id:27
│ │ ├── cardinality: [1 - 1]
│ │ ├── stats: [rows=1, distinct(25)=1, null(25)=0, distinct(26)=1, null(26)=0, distinct(27)=1, null(27)=0]
│ │ ├── cost: 0.01
│ │ ├── key: ()
│ │ ├── fd: ()-->(25-27)
│ │ └── cte-uses
│ │ └── &1: count=1 used-columns=(13-15)
│ └── filters (true)
└── f-k-checks-item: history(h_w_id,h_d_id) -> district(d_w_id,d_id)
└── anti-join (lookup district)
├── columns: h_w_id:52 h_d_id:53
├── lookup expression
│ └── filters
│ ├── h_w_id:52 = d_w_id:55 [outer=(52,55), constraints=(/52: (/NULL - ]; /55: (/NULL - ]), fd=(52)==(55), (55)==(52)]
│ ├── h_d_id:53 = d_id:54 [outer=(53,54), constraints=(/53: (/NULL - ]; /54: (/NULL - ]), fd=(53)==(54), (54)==(53)]
│ └── district.crdb_region:65 IN ('europe-west2', 'us-east1', 'us-west1') [outer=(65), constraints=(/65: [/'europe-west2' - /'europe-west2'] [/'us-east1' - /'us-east1'] [/'us-west1' - /'us-west1']; tight)]
├── lookup columns are key
├── cardinality: [0 - 1]
├── stats: [rows=1e-10]
├── cost: 4.02136491
├── key: ()
├── fd: ()-->(52,53)
├── cte-uses
│ └── &1: count=1 used-columns=(16,17)
├── with-scan &1
│ ├── columns: h_w_id:52 h_d_id:53
│ ├── mapping:
│ │ ├── column5:17 => h_w_id:52
│ │ └── column4:16 => h_d_id:53
│ ├── cardinality: [1 - 1]
│ ├── stats: [rows=1, distinct(52)=1, null(52)=0, distinct(53)=1, null(53)=0]
│ ├── cost: 0.01
│ ├── key: ()
│ ├── fd: ()-->(52,53)
│ └── cte-uses
│ └── &1: count=1 used-columns=(16,17)
└── filters (true)
Here's an easy way to reproduce this foreign-key behavior locally:
SET experimental_enable_unique_without_index_constraints = true;
CREATE TYPE region AS ENUM ('europe-west2','us-east1','us-west1');
CREATE TABLE district (
d_id INT8 NOT NULL,
d_w_id INT8 NOT NULL,
crdb_region region NOT VISIBLE NOT NULL AS (CASE WHEN d_w_id BETWEEN 0 AND 1665 THEN 'us-east1' WHEN d_w_id BETWEEN 1666 AND 3332 THEN 'us-west1' WHEN d_w_id BETWEEN 3333 AND 4999 THEN 'europe-west2' END) STORED,
CONSTRAINT "primary" PRIMARY KEY (crdb_region, d_w_id, d_id),
UNIQUE WITHOUT INDEX (d_w_id, d_id)
);
CREATE TABLE history (
rowid UUID NOT NULL DEFAULT gen_random_uuid(),
h_d_id INT8 NOT NULL,
h_w_id INT8 NOT NULL,
crdb_region region NOT VISIBLE NOT NULL AS (CASE WHEN h_w_id BETWEEN 0 AND 1665 THEN 'us-west1' WHEN h_w_id BETWEEN 1666 AND 3332 THEN 'europe-west2' WHEN h_w_id BETWEEN 3333 AND 4999 THEN 'us-east1' END) STORED,
CONSTRAINT "primary" PRIMARY KEY (crdb_region, h_w_id, rowid),
CONSTRAINT fk_h_w_id_ref_district FOREIGN KEY (h_w_id, h_d_id) REFERENCES district(d_w_id, d_id) NOT VALID
);
EXPLAIN (opt,verbose) INSERT INTO history (h_d_id, h_w_id) VALUES (4, 3);
info
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
insert history
├── columns: <none>
├── insert-mapping:
│ ├── column9:9 => rowid:1
│ ├── column1:7 => history.h_d_id:2
│ ├── column2:8 => history.h_w_id:3
│ └── column10:10 => history.crdb_region:4
├── check columns: check1:11
├── input binding: &1
├── cardinality: [0 - 0]
├── volatile, mutations
├── stats: [rows=0]
├── cost: 4.05639
├── values
│ ├── columns: column1:7 column2:8 column9:9 column10:10 check1:11
│ ├── cardinality: [1 - 1]
│ ├── volatile
│ ├── stats: [rows=1, distinct(7)=1, null(7)=0, distinct(8)=1, null(8)=0]
│ ├── cost: 0.02
│ ├── key: ()
│ ├── fd: ()-->(7-11)
│ ├── prune: (7-11)
│ └── (4, 3, gen_random_uuid(), 'us-west1', true)
└── f-k-checks
└── f-k-checks-item: history(h_w_id,h_d_id) -> district(d_w_id,d_id)
└── anti-join (lookup district)
├── columns: h_w_id:12 h_d_id:13
├── lookup expression
│ └── filters
│ ├── h_w_id:12 = d_w_id:15 [outer=(12,15), constraints=(/12: (/NULL - ]; /15: (/NULL - ]), fd=(12)==(15), (15)==(12)]
│ ├── h_d_id:13 = d_id:14 [outer=(13,14), constraints=(/13: (/NULL - ]; /14: (/NULL - ]), fd=(13)==(14), (14)==(13)]
│ └── district.crdb_region:16 IN ('europe-west2', 'us-east1', 'us-west1') [outer=(16), constraints=(/16: [/'europe-west2' - /'europe-west2'] [/'us-east1' - /'us-east1'] [/'us-west1' - /'us-west1']; tight)]
├── lookup columns are key
├── cardinality: [0 - 1]
├── stats: [rows=1e-10]
├── cost: 4.02639
├── key: ()
├── fd: ()-->(12,13)
├── cte-uses
│ └── &1: count=1 used-columns=(7,8)
├── with-scan &1
│ ├── columns: h_w_id:12 h_d_id:13
│ ├── mapping:
│ │ ├── column2:8 => h_w_id:12
│ │ └── column1:7 => h_d_id:13
│ ├── cardinality: [1 - 1]
│ ├── stats: [rows=1, distinct(12)=1, null(12)=0, distinct(13)=1, null(13)=0]
│ ├── cost: 0.01
│ ├── key: ()
│ ├── fd: ()-->(12,13)
│ └── cte-uses
│ └── &1: count=1 used-columns=(7,8)
└── filters (true)
Unfortunately, this did not seem to help the foreign key checks.
I take this back. After I recaptured table statistics for all tables, the foreign key lookups all seemed to properly derive tighter spans through their computed columns. Here's what the previous plan now looks like:
root@localhost:26257/tpcc> EXPLAIN (opt, verbose) INSERT INTO history (h_c_id, h_c_d_id, h_c_w_id, h_d_id, h_w_id, h_amount, h_date, h_data) VALUES (2057, 4, 3, 4, 3, 2100.9, '2021-04-15 15:22:14', 'test data');
info
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
insert history
├── columns: <none>
├── insert-mapping:
│ ├── column21:21 => rowid:1
│ ├── column1:13 => history.h_c_id:2
│ ├── column2:14 => history.h_c_d_id:3
│ ├── column3:15 => history.h_c_w_id:4
│ ├── column4:16 => history.h_d_id:5
│ ├── column5:17 => history.h_w_id:6
│ ├── column7:19 => h_date:7
│ ├── h_amount:22 => history.h_amount:8
│ ├── column8:20 => h_data:9
│ └── column23:23 => history.crdb_region:10
├── check columns: check1:24
├── input binding: &1
├── cardinality: [0 - 0]
├── volatile, mutations
├── stats: [rows=0]
├── cost: 8.60451651
├── values
│ ├── columns: column1:13 column2:14 column3:15 column4:16 column5:17 column7:19 column8:20 column21:21 h_amount:22 column23:23 check1:24
│ ├── cardinality: [1 - 1]
│ ├── volatile
│ ├── stats: [rows=1, distinct(13)=1, null(13)=0, distinct(14)=1, null(14)=0, distinct(15)=1, null(15)=0, distinct(16)=1, null(16)=0, distinct(17)=1, null(17)=0]
│ ├── cost: 0.02
│ ├── key: ()
│ ├── fd: ()-->(13-17,19-24)
│ ├── prune: (13-17,19-24)
│ └── (2057, 4, 3, 4, 3, '2021-04-15 15:22:14', 'test data', gen_random_uuid(), 2100.9, 'us-east1', true)
└── f-k-checks
├── f-k-checks-item: history(h_c_w_id,h_c_d_id,h_c_id) -> customer(c_w_id,c_d_id,c_id)
│ └── anti-join (lookup customer)
│ ├── columns: h_c_w_id:25 h_c_d_id:26 h_c_id:27
│ ├── lookup expression
│ │ └── filters
│ │ ├── h_c_w_id:25 = c_w_id:30 [outer=(25,30), constraints=(/25: (/NULL - ]; /30: (/NULL - ]), fd=(25)==(30), (30)==(25)]
│ │ ├── h_c_d_id:26 = c_d_id:29 [outer=(26,29), constraints=(/26: (/NULL - ]; /29: (/NULL - ]), fd=(26)==(29), (29)==(26)]
│ │ ├── h_c_id:27 = c_id:28 [outer=(27,28), constraints=(/27: (/NULL - ]; /28: (/NULL - ]), fd=(27)==(28), (28)==(27)]
│ │ └── customer.crdb_region:49 IN ('europe-west2', 'us-west1') [outer=(49), constraints=(/49: [/'europe-west2' - /'europe-west2'] [/'us-west1' - /'us-west1']; tight)]
│ ├── lookup columns are key
│ ├── cardinality: [0 - 1]
│ ├── stats: [rows=1e-10]
│ ├── cost: 4.32545852
│ ├── key: ()
│ ├── fd: ()-->(25-27)
│ ├── cte-uses
│ │ └── &1: count=1 used-columns=(13-15)
│ ├── anti-join (lookup customer)
│ │ ├── columns: h_c_w_id:25 h_c_d_id:26 h_c_id:27
│ │ ├── lookup expression
│ │ │ └── filters
│ │ │ ├── h_c_w_id:25 = c_w_id:30 [outer=(25,30), constraints=(/25: (/NULL - ]; /30: (/NULL - ]), fd=(25)==(30), (30)==(25)]
│ │ │ ├── h_c_d_id:26 = c_d_id:29 [outer=(26,29), constraints=(/26: (/NULL - ]; /29: (/NULL - ]), fd=(26)==(29), (29)==(26)]
│ │ │ ├── h_c_id:27 = c_id:28 [outer=(27,28), constraints=(/27: (/NULL - ]; /28: (/NULL - ]), fd=(27)==(28), (28)==(27)]
│ │ │ └── customer.crdb_region:49 = 'us-east1' [outer=(49), immutable, constraints=(/49: [/'us-east1' - /'us-east1']; tight), fd=()-->(49)]
│ │ ├── lookup columns are key
│ │ ├── cardinality: [0 - 1]
│ │ ├── stats: [rows=0.666666667, distinct(25)=0.666666667, null(25)=0, distinct(26)=0.666666667, null(26)=0, distinct(27)=0.666666667, null(27)=0]
│ │ ├── cost: 2.43233936
│ │ ├── key: ()
│ │ ├── fd: ()-->(25-27)
│ │ ├── with-scan &1
│ │ │ ├── columns: h_c_w_id:25 h_c_d_id:26 h_c_id:27
│ │ │ ├── mapping:
│ │ │ │ ├── column3:15 => h_c_w_id:25
│ │ │ │ ├── column2:14 => h_c_d_id:26
│ │ │ │ └── column1:13 => h_c_id:27
│ │ │ ├── cardinality: [1 - 1]
│ │ │ ├── stats: [rows=1, distinct(25)=1, null(25)=0, distinct(26)=1, null(26)=0, distinct(27)=1, null(27)=0]
│ │ │ ├── cost: 0.01
│ │ │ ├── key: ()
│ │ │ ├── fd: ()-->(25-27)
│ │ │ └── cte-uses
│ │ │ └── &1: count=1 used-columns=(13-15)
│ │ └── filters (true)
│ └── filters (true)
└── f-k-checks-item: history(h_w_id,h_d_id) -> district(d_w_id,d_id)
└── anti-join (lookup district)
├── columns: h_w_id:52 h_d_id:53
├── lookup expression
│ └── filters
│ ├── h_w_id:52 = d_w_id:55 [outer=(52,55), constraints=(/52: (/NULL - ]; /55: (/NULL - ]), fd=(52)==(55), (55)==(52)]
│ ├── h_d_id:53 = d_id:54 [outer=(53,54), constraints=(/53: (/NULL - ]; /54: (/NULL - ]), fd=(53)==(54), (54)==(53)]
│ └── district.crdb_region:65 IN ('europe-west2', 'us-west1') [outer=(65), constraints=(/65: [/'europe-west2' - /'europe-west2'] [/'us-west1' - /'us-west1']; tight)]
├── lookup columns are key
├── cardinality: [0 - 1]
├── stats: [rows=1e-10]
├── cost: 4.24905799
├── key: ()
├── fd: ()-->(52,53)
├── cte-uses
│ └── &1: count=1 used-columns=(16,17)
├── anti-join (lookup district)
│ ├── columns: h_w_id:52 h_d_id:53
│ ├── lookup expression
│ │ └── filters
│ │ ├── h_w_id:52 = d_w_id:55 [outer=(52,55), constraints=(/52: (/NULL - ]; /55: (/NULL - ]), fd=(52)==(55), (55)==(52)]
│ │ ├── h_d_id:53 = d_id:54 [outer=(53,54), constraints=(/53: (/NULL - ]; /54: (/NULL - ]), fd=(53)==(54), (54)==(53)]
│ │ └── district.crdb_region:65 = 'us-east1' [outer=(65), immutable, constraints=(/65: [/'us-east1' - /'us-east1']; tight), fd=()-->(65)]
│ ├── lookup columns are key
│ ├── cardinality: [0 - 1]
│ ├── stats: [rows=0.666666667, distinct(52)=0.666666667, null(52)=0, distinct(53)=0.666666667, null(53)=0]
│ ├── cost: 2.39959628
│ ├── key: ()
│ ├── fd: ()-->(52,53)
│ ├── with-scan &1
│ │ ├── columns: h_w_id:52 h_d_id:53
│ │ ├── mapping:
│ │ │ ├── column5:17 => h_w_id:52
│ │ │ └── column4:16 => h_d_id:53
│ │ ├── cardinality: [1 - 1]
│ │ ├── stats: [rows=1, distinct(52)=1, null(52)=0, distinct(53)=1, null(53)=0]
│ │ ├── cost: 0.01
│ │ ├── key: ()
│ │ ├── fd: ()-->(52,53)
│ │ └── cte-uses
│ │ └── &1: count=1 used-columns=(16,17)
│ └── filters (true)
└── filters (true)
And with that plus the patch from earlier, we get entirely region-local TPC-C transactions! (The three workload runs below were executed concurrently.)
➜ rp ssh nathan-1618341380-01-n12cpu4-geo:4 -- './cockroach workload run tpcc --warehouses=5000 --active-warehouses=300 --partitions=3 --partition-affinity=0 --duration=10m {pgurl:1-3}'
...
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 1274 2.1 106.0 79.7 226.5 704.6 973.1 delivery
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 12489 20.8 102.6 39.8 151.0 1946.2 2550.1 newOrder
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 1251 2.1 17.7 8.4 35.7 318.8 385.9 orderStatus
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 12786 21.3 49.6 21.0 92.3 973.1 1946.2 payment
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 1277 2.1 21.1 13.6 35.7 302.0 352.3 stockLevel
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
600.0s 0 29077 48.5 72.2 30.4 125.8 1610.6 2550.1
_elapsed_______tpmC____efc__avg(ms)__p50(ms)__p90(ms)__p95(ms)__p99(ms)_pMax(ms)
600.0s 1248.9 32.4% 102.6 39.8 67.1 151.0 1946.2 2550.1
➜ rp ssh nathan-1618341380-01-n12cpu4-geo:8 -- './cockroach workload run tpcc --warehouses=5000 --active-warehouses=300 --partitions=3 --partition-affinity=1 --duration=10m {pgurl:5-7}'
...
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 1279 2.1 131.7 83.9 201.3 1610.6 2818.6 delivery
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 12556 20.9 117.2 37.7 109.1 2684.4 3489.7 newOrder
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 1267 2.1 23.3 9.4 31.5 436.2 604.0 orderStatus
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 12642 21.1 74.1 19.9 65.0 1811.9 2080.4 payment
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 1282 2.1 48.2 16.3 71.3 973.1 1140.9 stockLevel
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
600.0s 0 29026 48.4 91.9 31.5 104.9 2281.7 3489.7
_elapsed_______tpmC____efc__avg(ms)__p50(ms)__p90(ms)__p95(ms)__p99(ms)_pMax(ms)
600.0s 1255.6 32.5% 117.2 37.7 56.6 109.1 2684.4 3489.7
➜ rp ssh nathan-1618341380-01-n12cpu4-geo:12 -- './cockroach workload run tpcc --warehouses=5000 --active-warehouses=300 --partitions=3 --partition-affinity=2 --duration=10m {pgurl:9-11}'
...
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 1276 2.1 132.2 71.3 201.3 1744.8 3087.0 delivery
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 12484 20.8 127.4 33.6 113.2 2818.6 4026.5 newOrder
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 1290 2.1 26.9 7.3 33.6 570.4 738.2 orderStatus
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 12713 21.2 61.1 16.8 65.0 1610.6 2818.6 payment
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 1275 2.1 48.4 21.0 71.3 872.4 1140.9 stockLevel
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
600.0s 0 29038 48.4 90.6 29.4 104.9 2684.4 4026.5
_elapsed_______tpmC____efc__avg(ms)__p50(ms)__p90(ms)__p95(ms)__p99(ms)_pMax(ms)
600.0s 1248.4 32.4% 127.4 33.6 60.8 113.2 2818.6 4026.5
For a combined efficiency of 32.4% + 32.5% + 32.4% = 97.3%.
And for reference, here is what one of those delivery transactions looked like:
Thanks for all your work on this, @nvanbenschoten! I'll be working on this tomorrow so I'll be able to respond to all of these ideas then.
Awesome sleuthing, @nvanbenschoten!! I think your diff above to fix findJoinFilterConstants makes perfect sense. Do you want to go ahead and submit a PR for that fix (I can also do it if you prefer)?
The second issue with the foreign key checks seems to be because we haven't costed locality optimized anti joins aggressively enough. I'm glad that recollecting stats seemed to fix the issue, but I think we want to ensure that we always choose locality optimized anti joins when possible (at least until the optimizer is more distribution-aware). I will submit a PR to fix that issue.
One question (unrelated to this issue): I noticed that in the bundle for the first query, there were no stats for the order_line table. I know we've had issues in the past with stats not showing up for TPC-C, but ideally we should always either inject stats for TPC-C using the pre-calculated stats in workload, or they need to be included in the backups that get loaded. Do you know what might have happened in this case?
Awesome sleuthing, @nvanbenschoten!!
+1 Thanks for getting to the bottom of this! I agree that your proposed fix looks great.
It turned out that when using an INT column instead of the ENUM column, things actually worked, which I wasn't expecting
It may not have been immediately obvious, but this was a brilliant debugging step that highlighted the issue. It's perhaps confusing because the issue was less about the type of the column and more about the fact that ENUMs come with an implicit CHECK constraint.
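To make the implicit-constraint point concrete, here is a minimal sketch (the type and value names are hypothetical, not from this schema): a column declared with an ENUM type behaves as if it carried a CHECK constraint restricting it to the enum's members, which is what gives the optimizer extra information to exploit.

```sql
-- Hypothetical illustration of the implicit constraint on an ENUM column.
CREATE TYPE region AS ENUM ('us-east1', 'us-west1', 'europe-west1');

CREATE TABLE t (
    id INT8 PRIMARY KEY,
    crdb_region region NOT NULL
    -- The ENUM type implicitly behaves like:
    -- CHECK (crdb_region IN ('us-east1', 'us-west1', 'europe-west1'))
);
```

An INT column carries no such implicit constraint, which is why the two cases took different paths through the optimizer.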
Thanks for taking a look @rytaft and @mgartner!
I think your diff above to fix findJoinFilterConstants makes perfect sense. Do you want to go ahead and submit a PR for that fix (I can also do it if you prefer)?
Sure, I'll send out the patch, along with some testing that demonstrates the impact.
The second issue with the foreign key checks seems to be because we haven't costed locality optimized anti joins aggressively enough. I'm glad that recollecting stats seemed to fix the issue, but I think we want to ensure that we always choose locality optimized anti joins when possible (at least until the optimizer is more distribution-aware). I will submit a PR to fix that issue.
Sounds great. https://github.com/cockroachdb/cockroach/pull/63830 looks like a good change.
What's interesting about this situation that I hadn't noticed before is that locality-optimized search is saving us here. I had been assuming that with fresh stats we were suddenly able to infer the single crdb_region value in the foreign key lookup join because it's a computed expression of the warehouse column (c_w_id for the customer table, d_w_id for the district table). But that isn't what's happening. Instead, we're still fanning out to all three crdb_region values, but locality-optimized search short-circuits before this actually results in a cross-region RPC. So it hides the latency of a fanout that we don't strictly need to do.
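As a rough mental model (not the optimizer's actual plan shape, and with hypothetical table, column, and region values), locality-optimized search behaves like a limited UNION ALL that exhausts the local region's span before touching the remote ones:

```sql
-- Conceptual sketch only: locality-optimized search is akin to a limited
-- UNION ALL where the local branch runs first and the remote branches are
-- only consulted if it comes up short. Names and values are illustrative.
SELECT c_id
FROM (
    (SELECT c_id FROM customer
      WHERE crdb_region = 100 AND c_w_id = 10 AND c_d_id = 5)        -- local branch
    UNION ALL
    (SELECT c_id FROM customer
      WHERE crdb_region IN (200, 300) AND c_w_id = 10 AND c_d_id = 5) -- remote branches
)
LIMIT 1;  -- if the local branch satisfies the limit, the remote scans never run
```

This is why the fanout to all three regions doesn't show up as cross-region latency: the remote branches are planned but usually never executed.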
I'm not quite sure how to think about this. On one hand, it would be nice to derive the crdb_region value in this case, like we are able to do in other cases. On the other hand, it's pretty neat that locality-optimized search is a general-purpose tool that can dramatically reduce the need for these kinds of specific optimizations.
Maybe we just open another issue to track the longer-term item of deriving the value of the computed column in these foreign key lookup queries? What do you think?
Do you know what might have happened in this case?
I'm not totally sure. This dataset was generated using IMPORT, so it should have used the pre-calculated stats in workload. Perhaps those have rotted? Or they were discarded during one of the multi-region schema changes I performed. What happens to stats when a new column is added to a table?
Also, as a follow-up to all of this, I switched the database over to REGION survivability and things still look quite good. Here's the result from one region:
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 1283 2.1 234.9 209.7 453.0 805.3 1543.5 delivery
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 12375 20.6 282.7 226.5 385.9 1946.2 2952.8 newOrder
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 1276 2.1 13.6 8.4 30.4 130.0 469.8 orderStatus
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 12630 21.0 188.1 151.0 335.5 1409.3 2684.4 payment
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
600.0s 0 1260 2.1 32.4 14.7 113.2 352.3 671.1 stockLevel
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
600.0s 0 28824 48.0 216.3 192.9 335.5 1744.8 2952.8
_elapsed_______tpmC____efc__avg(ms)__p50(ms)__p90(ms)__p95(ms)__p99(ms)_pMax(ms)
600.0s 1237.5 32.1% 282.7 226.5 260.0 385.9 1946.2 2952.8
We do see p50 latency increase by more than the replication latency (~65ms). I believe this is explained by a few replication pipeline stalls during foreign key checks on recently written rows in some of these transactions. The use of column families, which were added in https://github.com/cockroachdb/cockroach/pull/30624, is known to avoid most of these. Unfortunately, it's not easy to add column families after the fact to confirm that.
On one hand, it would be nice to derive the crdb_region value in this case, like we are able to do in other cases.... Maybe we just open another issue to track the longer-term item of deriving the value of the computed column in these foreign key lookup queries? What do you think?
Yea, let's open an issue. It's a bit tricky since the foreign key check scans the buffer from the mutation output rather than using the constant values directly, so we'd need to somehow detect cases where we can "inline" the constant values rather than scanning from the buffer. I opened #63882 -- feel free to add any additional color there.
This dataset was generated using IMPORT, so it should have used the pre-calculated stats in workload. Perhaps those have rotted? Or they were discarded during one of the multi-region schema changes I performed. What happens to stats when a new column is added to a table?
They shouldn't ever be discarded. I did a bit of poking around, and it seems like we only inject stats with workload fixtures import tpcc, while workload init tpcc --data-loader IMPORT doesn't do it. Have we mostly abandoned workload fixtures at this point? I think we can get stats injected for workload init, but we just need to set a bool to true somewhere in workload... I can open an issue and make this change, just let me know if you think that makes sense.
Hey @nvanbenschoten, I'm assigning this issue back to you to open the PR with your diff from above. Also let me know your thoughts about my TPC-C stats question above whenever you get a chance.
it seems like we only inject stats with workload fixtures import tpcc, while workload init tpcc --data-loader IMPORT doesn't do it. Have we mostly abandoned workload fixtures at this point? I think we can get stats injected for workload init, but we just need to set a bool to true somewhere in workload... I can open an issue and make this change, just let me know if you think that makes sense.
Good question. I think we still run workload fixtures import tpcc in our nightly roachtests. I don't actually see any uses of workload init --data-loader. But I also think the intention was for these two approaches to dataset initialization to behave identically, so we probably should be injecting stats in workload init. Opening an issue SGTM!
Done: #64580. Thanks for the info!
note: the name on this might be very off, I was just guessing
Statement diagnostics of relevant query: stmt-bundle-650217270527393794.zip
In multi-region TPC-C, we have a schema that looks like the following:
Notice that each table has a crdb_region column which is computed from its warehouse column (ol_w_id for order_line, s_w_id for stock).
One of the "Stock Level" transaction statements runs a join against these two tables. The statement constrains the warehouse sufficiently that we should be able to determine that we can search locally, but we don't, so the query is executed with a remote scan that bumps it from about 10ms to about 100ms:
If we manually push the s_w_id through the join, things still don't improve:
However, if we also manually derive the stock table's crdb_region column, things remain local to the gateway region. We can see this in both a change in query plan and a large speedup:
cc @rytaft for triage.