When using read committed isolation (or when optimizer_use_lock_op_for_serializable is enabled) we use the new Lock operator implementation of SELECT FOR UPDATE. This Lock operator acts as an optimization barrier, which we need to prevent the locking side effect from being optimized away.
But there are some basic optimizations that we should bring back for the Lock operator. One such optimization is decorrelation of a correlated EXISTS subquery into a semi join. Consider the following query:
CREATE TABLE ab (a INT PRIMARY KEY, b INT);
CREATE TABLE c (c INT PRIMARY KEY);
SELECT * FROM ab WHERE a = 1 AND EXISTS (SELECT * FROM c WHERE c = b);
For a normal SELECT this query becomes a simple semi-join:
demo@127.0.0.1:26257/demoapp/defaultdb> EXPLAIN SELECT * FROM ab WHERE a = 1 AND EXISTS (SELECT * FROM c WHERE c = b);
info
---------------------------
distribution: local
vectorized: true
• lookup join (semi)
│ table: c@c_pkey
│ equality: (b) = (c)
│ equality cols are key
│
└── • scan
missing stats
table: ab@ab_pkey
spans: [/1 - /1]
(12 rows)
Time: 2ms total (execution 2ms / network 0ms)
And same for the old implementation of SELECT FOR UPDATE, but for the new implementation it stays an apply-join:
demo@127.0.0.1:26257/demoapp/defaultdb> EXPLAIN SELECT * FROM ab WHERE a = 1 AND EXISTS (SELECT * FROM c WHERE c = b FOR UPDATE);
info
----------------------------------
distribution: local
vectorized: true
• lookup join (semi)
│ table: c@c_pkey
│ equality: (b) = (c)
│ equality cols are key
│ locking strength: for update
│
└── • scan
missing stats
table: ab@ab_pkey
spans: [/1 - /1]
(13 rows)
Time: 2ms total (execution 2ms / network 0ms)
demo@127.0.0.1:26257/demoapp/defaultdb> SET optimizer_use_lock_op_for_serializable = true;
SET
Time: 1ms total (execution 0ms / network 0ms)
demo@127.0.0.1:26257/demoapp/defaultdb> EXPLAIN SELECT * FROM ab WHERE a = 1 AND EXISTS (SELECT * FROM c WHERE c = b FOR UPDATE);
info
---------------------------
distribution: local
vectorized: true
• apply join (semi)
│
└── • scan
missing stats
table: ab@ab_pkey
spans: [/1 - /1]
(9 rows)
Time: 1ms total (execution 1ms / network 0ms)
When using read committed isolation (or when
optimizer_use_lock_op_for_serializable
is enabled) we use the new Lock operator implementation of SELECT FOR UPDATE. This Lock operator acts as an optimization barrier, which we need to prevent the locking side effect from being optimized away.But there are some basic optimizations that we should bring back for the Lock operator. One such optimization is decorrelation of a correlated EXISTS subquery into a semi join. Consider the following query:
For a normal SELECT this query becomes a simple semi-join:
And same for the old implementation of SELECT FOR UPDATE, but for the new implementation it stays an apply-join:
Jira issue: CRDB-33647 Epic CRDB-38938