yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.88k stars 1.06k forks source link

[YSQL] Foreign key check using UNIQUE index does unnecessary lookup on base table #3432

Closed ndeodhar closed 11 months ago

ndeodhar commented 4 years ago

Jira Link: DB-1927 Consider a schema like

create table base(a int PRIMARY KEY, b text);
create unique index base_idx on base(b);

create table ref(a text primary key, b references base(b));

While inserting data into table ref, YSQL will do a lookup on index base(b) to check that the row exists. However, it turns out that YSQL is doing 2 lookups: (1) Lookup in index table for b= (2) Lookup in base table for ybctid found in step (1).

Step (2) is unnecessary and should be eliminated.

It looks like FOR KEY SHARE in query results in a query plan that will include step 2 also. See the query plans for the below 2 statements with and without FOR KEY SHARE:

yugabyte=# explain SELECT 1 FROM "public"."base" x WHERE "b" OPERATOR(pg_catalog.=) '3';
                                 QUERY PLAN                                  
-----------------------------------------------------------------------------
 Index Only Scan using base_idx on base x  (cost=0.00..4.11 rows=1 width=4)
   Index Cond: (b = '3'::text)
(2 rows)

yugabyte=# explain SELECT 1 FROM "public"."base" x WHERE "b" OPERATOR(pg_catalog.=) '3' FOR KEY SHARE OF x  ;
                                  QUERY PLAN                                   
-------------------------------------------------------------------------------
 LockRows  (cost=0.00..4.12 rows=1 width=36)
   ->  Index Scan using base_idx on base x  (cost=0.00..4.11 rows=1 width=36)
         Index Cond: (b = '3'::text)

As seen, query with FOR KEY SHARE uses Index Scan whereas query without it uses Index Only Scan

sushantrmishra commented 1 year ago

yugabyte=# explain (analyze, dist) SELECT 1 FROM "public"."base" x WHERE "b" OPERATOR(pg_catalog.=) '3';

                                                      QUERY PLAN

--------------------------------------------------------------------------------------------------------------
--------
 Index Only Scan using base_idx on base x  (cost=0.00..4.01 rows=1 width=4) (actual time=0.384..0.385 rows=0 l
oops=1)
   Index Cond: (b = '3'::text)
   Heap Fetches: 0
   Storage Index Read Requests: 1
   Storage Index Execution Time: 0.000 ms
 Planning Time: 0.054 ms
 Execution Time: 0.413 ms
 Storage Read Requests: 1
 Storage Write Requests: 0
 Storage Execution Time: 0.000 ms
 Peak Memory Usage: 8 kB
(11 rows)

Time: 1.570 ms
sushantrmishra commented 11 months ago

Based on the DocDB request, we do not see Lookup in base table.

UUID of the tables:

Base table UUID -- 000033f5000030008000000000004000 
Ref table UUID -- 000033f5000030008000000000004006 
base_idx UUID -- 000033f5000030008000000000004005 

DocDB request during insert in the ref table. insert into ref values ('aaaa1', 'deadbeef');

2023-10-26 20:34:07.884 UTC [386] LOG:  statement: insert into ref values ('aaaa1', 'deadbeef');
I1026 20:34:07.885385   386 pg_session.cc:272] Buffering operation: { WRITE active: 1 read_time: { read: <invalid> local_limit: <invalid> global_limit: <invalid> in_txn_limit: <invalid> serial_no: 0 } request: client: YQL_CLIENT_PGSQL stmt_id: 7299273507072 stmt_type: PGSQL_INSERT table_id: "000033f5000030008000000000004006" schema_version: 1 hash_code: 12381 ybctid_column_value { value { binary_value: "47305D53616161613100002121" } } column_values { column_id: 11 expr { value { string_value: "deadbeef" } } } column_refs { } ysql_catalog_version: 5 partition_key: "0]" }
I1026 20:34:07.885782   386 pg_session.cc:307] Applying operation: { READ active: 1 read_time: { read: <invalid> local_limit: <invalid> global_limit: <invalid> in_txn_limit: <invalid> serial_no: 0 } request: client: YQL_CLIENT_PGSQL stmt_id: 7299282904160 schema_version: 0 targets { column_id: -8 } is_forward_scan: 1 limit: 1024 return_paging_state: 1 max_hash_code: 32767 table_id: "000033f5000030008000000000004005" ybctid_column_value { value { binary_value: "4774185364656164626565660000212421" } } row_mark_type: ROW_MARK_KEYSHARE batch_arguments { order: 0 ybctid { value { binary_value: "4774185364656164626565660000212421" } } } upper_bound { key: "8000" is_inclusive: 0 } wait_policy: WAIT_ERROR partition_key: "" metrics_capture: PGSQL_METRICS_CAPTURE_NONE size_limit: 0 }
I1026 20:34:07.885867   386 pg_session.cc:334] Flushing collected operations, using session type: kTransactional num ops: 2
sushantrmishra commented 11 months ago

For IndexScan vs IndexOnlyScan issue:

For Key share we need to lock the main table row , hence the IndexScan and without key share we can return the results from IndexOnlyScan.