yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.94k stars 1.06k forks source link

[YSQL] Hash Partial Index on INT is unused if query predicate is different from index predicate #19346

Open karthik-ramanathan-3006 opened 1 year ago

karthik-ramanathan-3006 commented 1 year ago

Jira Link: DB-8148

Description

Consider a table with a hash partial index on a integer column:

CREATE TABLE foo (h INT PRIMARY KEY, v INT);
INSERT INTO foo (SELECT i, i FROM generate_series(1, 10000) AS i);
CREATE INDEX foo_v ON foo(v) WHERE v < 1024;

yugabyte=# \d foo
                Table "public.foo"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 h      | integer |           | not null |
 v      | integer |           |          |
Indexes:
    "foo_pkey" PRIMARY KEY, lsm (h HASH)
    "foo_v" lsm (v HASH) WHERE v < 1024

In a cluster with 3 t-servers, this results in foo having 6 tablets, and foo_v having 6 tablets.

Let us consider SELECT queries with a predicate on foo.v such that the predicate:

Exactly matches the Index predicate

yugabyte=# EXPLAIN (ANALYZE, DIST) SELECT * FROM foo WHERE v < 1024;
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Index Scan using foo_v on foo  (cost=0.00..102.00 rows=1000 width=8) (actual time=13.654..43.841 rows=1023 loops=1)
   Storage Table Read Requests: 6
   Storage Table Read Execution Time: 27.376 ms
   Storage Index Read Requests: 6
   Storage Index Read Execution Time: 4.847 ms
   Storage Rows Scanned Requests: 4618.000
 Planning Time: 0.133 ms
 Execution Time: 44.755 ms
 Storage Read Requests: 12
 Storage Read Execution Time: 32.223 ms
 Storage Rows Scanned Requests: 4618.000
 Storage Write Requests: 0.000
 Catalog Read Requests: 0
 Catalog Write Requests: 0.000
 Storage Flush Requests: 0
 Storage Execution Time: 32.223 ms
 Peak Memory Usage: 8 kB
(17 rows)

Is a subset of the Index predicate

yugabyte=# EXPLAIN (ANALYZE, DIST) SELECT * FROM foo WHERE v < 1023;
                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------
 Seq Scan on foo  (cost=0.00..102.50 rows=1000 width=8) (actual time=26.544..27.118 rows=1022 loops=1)
   Remote Filter: (v < 1023)
   Storage Table Read Requests: 1
   Storage Table Read Execution Time: 22.400 ms
   Storage Rows Scanned Requests: 10000.000
 Planning Time: 0.362 ms
 Execution Time: 27.961 ms
 Storage Read Requests: 1
 Storage Read Execution Time: 22.400 ms
 Storage Rows Scanned Requests: 10000.000
 Storage Write Requests: 0.000
 Catalog Read Requests: 0
 Catalog Write Requests: 0.000
 Storage Flush Requests: 0
 Storage Execution Time: 22.400 ms
 Peak Memory Usage: 8 kB
(16 rows)

Is a superset of the Index predicate

In summary, there are two issues observed for hash partial indexes:

Warning: Please confirm that this issue does not contain any sensitive information

sushantrmishra commented 12 months ago

Test on PG15:

[local]:5439 smishra@postgres=# select version();
                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 15.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
(1 row)

Time: 20.047 ms
[local]:5439 smishra@postgres=#  EXPLAIN (ANALYZE) SELECT * FROM foo WHERE v < 1023;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Index Scan using foo_v on foo  (cost=0.28..46.16 rows=1022 width=8) (actual time=0.067..0.684 rows=1022 loops=1)
   Index Cond: (v < 1023)
 Planning Time: 0.207 ms
 Execution Time: 0.797 ms
(4 rows)

Time: 1.699 ms