yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.96k stars 1.07k forks source link

[YSQL] Lock main table rows as well for IndexOnlyScan in serializable isolation #7522

Open jaki opened 3 years ago

jaki commented 3 years ago

Jira Link: DB-1887

(Summary updated by @pkj415)

Consider the following:

create table test (h1 int, h2 int, r1 int, r2 int, v1 int, v2 int, primary key ((h1, h2), r1, r2));
create index idx on test (v1, h1);
insert into test values (1, 20, 30, 40, 5, 60);
insert into test values (1, 21, 30, 40, 5, 60);

Currently in serializable isolation, the following happens:

  1. If reading from the main table, a kStrongRead lock is taken on the longest main table prefix that covers all rows that match the where clause criteria (both existing and possible future inserts).
select * from test where v2=60; -- kStrongRead on all tablets of the main table i.e., empty prefixes on all tablets.
select * from test where h1=1 and h2=20 and r2=40; -- kStrongRead on [1, 20,,] on main table.
  1. If reading using an Index Scan, a kStrongRead lock is taken both on the longest index table prefix that covers all rows that match the where clause criteria (both existing and possible future inserts) and the "actual" matching main table rows that exist.
select * from test where h1=1 and v1=5; -- kStrongRead on [5, 1] on index table and kStrongRead on both rows of main table i.e., [1, 20, 30, 40] and [1, 21, 30, 40].

select * from test where v1=5; -- kStrongRead on [5,] on index table and kStrongRead on both rows of main table i.e., [1, 20, 30, 40] and [1, 21, 30, 40].
  1. If reading using an Index Only Scan, a kStrongRead lock is only taken on the longest index table prefix that covers all rows that match the where clause criteria (both existing and possible future inserts).
select h1 from test where v1=5; -- kStrongRead on [5,] on index table
  1. If a SELECT with an explicit row lock request such as FOR UPDATE is used, Index Only Scan is never chosen. So there are two possibilities: (a) Direct read from main table: in this case a FOR UPDATE is taken on the longest main table prefix that matches the criteria. No lock is taken on the index table.
  select * from test where v2=60 for update; -- [kStrongRead, kStrongWrite] on all tablets of the main table i.e., empty prefixes on all tablets.
  select * from test where h1=1 and h2=20 and r2=40 for update; -- [kStrongRead, kStrongWrite] on [1, 20,,] on main table.
(b) Index scan is used: in this case a `FOR UPDATE` is taken on both the longest index table prefix that satisfies the criteria and the "actual" matching main table rows that exist.
select * from test where h1=1 and v1=5 for update; -- [kStrongRead, kStrongWrite] on [5, 1] on index table and [kStrongRead, kStrongWrite] on both rows of main table i.e., [1, 20, 30, 40] and [1, 21, 30, 40].

select * from test where v1=5 for update; -- [kStrongRead, kStrongWrite] on [5,] on index table and [kStrongRead, kStrongWrite] on both rows of main table i.e., [1, 20, 30, 40] and [1, 21, 30, 40].

Add tests to assert for the above semantics.

Also, semantics of (1), (2) and (4) are for sure correct, but check if those of (3) are really correct. Should an Index Only Scan conflict with a FOR UPDATE lock on the same row in the main table via 4(a)? Currently they do not conflict. (#19036 has the same root cause).


Old github summary:

Add a test to make sure that SELECTs that would be index only scans (if it weren't for the row locking) lock not just the index but also the indexed table. Something like

Session A:

CREATE TABLE t (i int PRIMARY KEY, j int, k int, UNIQUE (j));
INSERT INTO t VALUES (1, 2, 3);
BEGIN ISOLATION LEVEL SERIALIZABLE;
EXPLAIN SELECT j FROM t WHERE j = 2;
EXPLAIN SELECT j FROM t WHERE j = 2 FOR UPDATE;
SELECT j FROM t WHERE j = 2 FOR UPDATE;

Session B:

UPDATE t SET j = 4 WHERE i = 1;

Session A:

COMMIT;

There should be a conflict, so one of the two should fail.

I suggest using ./bin/yb-ctl create --tserver_flags "TEST_docdb_log_write_batches=true,vmodule=tablet_service=2" --ysql_num_shards_per_tserver=1 and observing tserver logs to see that

Also notice how the explain switches to using IndexScan instead of IndexOnlyScan when FOR UPDATE is used.

Try snapshot isolation as well; try other row locks like FOR SHARE as well.

It would be good to do this for GIN IndexOnlyScans when that comes around.

jaki commented 3 years ago

This can probably be done in pg_mini-test.cc because other row locking tests are there.