yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.94k stars 1.06k forks source link

[YSQL] Sequential scans with a filter on an actual tables are 2X faster than index/indexonly scan on an index table with a filter. #15975

Open shantanugupta-yb opened 1 year ago

shantanugupta-yb commented 1 year ago

Jira Link: DB-5383

Description

In case of orderby clause with a filter condition on a key which is not indexed/primary key, YB an expression pushdown with a sequential scan on an actual table and finally performs a sort operation in query layer. Where as in case of column which is a primary key or which is an index, the filter condition/expression is not pushed down we are fetching all rows in batches of 1024. To sort 999999 rows, in case of (sequential scan+expression push down) YB issues 327 rpcs where as in case of index/indexonly scan it is issuing 977 rpcs.

Orderby clause with 1 filter condition on non indexed column of bigint datatype 1011.96
Orderby clause with 1 filter condition on pkey of bigint datatype 1895.69
Orderby clause on with 1 filter condition secondary indexed column of bigint datatype 2283.66

sequential scan+expression push down:

yugabyte=# explain (analyse, verbose, dist) select col_bigint_id_1 from nonIndexed1M_1 where col_bigint_id_1>1 order by col_bigint_id_1 ;
                                                           QUERY PLAN                                                            
---------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=152.33..154.83 rows=1000 width=8) (actual time=787.262..879.237 rows=999999 loops=1)
   Output: col_bigint_id_1
   Sort Key: nonindexed1m_1.col_bigint_id_1
   Sort Method: external merge  Disk: 17696kB
   ->  Seq Scan on public.nonindexed1m_1  (cost=0.00..102.50 rows=1000 width=8) (actual time=2.331..526.595 rows=999999 loops=1)
         Output: col_bigint_id_1
         Remote Filter: (nonindexed1m_1.col_bigint_id_1 > 1)
         Storage Table Read Requests: 327
         Storage Table Execution Time: 438.996 ms
 Planning Time: 0.042 ms
 Execution Time: 915.294 ms
 Storage Read Requests: 327
 Storage Write Requests: 0
 Storage Execution Time: 438.996 ms
 Peak Memory Usage: 6297 kB
(15 rows)

index/indexonly scan:

yugabyte=# explain (analyse, verbose, dist) select col_bigint_id_1 from Indexed1M_1 where col_bigint_id_1>1 order by col_bigint_id_1 ;
                                                                               QUERY PLAN                                                                               
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using indexed1m_1_col_bigint_id_1_idx on public.indexed1m_1  (cost=0.00..1129.00 rows=10000 width=8) (actual time=2.828..2410.901 rows=999999 loops=1)
   Output: col_bigint_id_1
   Index Cond: (indexed1m_1.col_bigint_id_1 > 1)
   Heap Fetches: 0
   Storage Index Read Requests: 977
   Storage Index Execution Time: 2298.981 ms
 Planning Time: 0.061 ms
 Execution Time: 2452.571 ms
 Storage Read Requests: 977
 Storage Write Requests: 0
 Storage Execution Time: 2298.981 ms
 Peak Memory Usage: 0 kB
(12 rows)

yugabyte=# explain (analyse, verbose, dist) select col_bigint_id_1 from pkeyBigint1M_1 where col_bigint_id_1>1 order by col_bigint_id_1;
                                                                    QUERY PLAN                                                                     
---------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using pkeybigint1m_1_pkey on public.pkeybigint1m_1  (cost=0.00..4.11 rows=1 width=8) (actual time=2.274..1766.727 rows=999999 loops=1)
   Output: col_bigint_id_1
   Index Cond: (pkeybigint1m_1.col_bigint_id_1 > 1)
   Storage Index Read Requests: 977
   Storage Index Execution Time: 1664.986 ms
 Planning Time: 0.058 ms
 Execution Time: 1808.370 ms
 Storage Read Requests: 977
 Storage Write Requests: 0
 Storage Execution Time: 1664.986 ms
 Peak Memory Usage: 8 kB
(11 rows)

Schema details:

sushantrmishra commented 1 year ago

Tried in simpler test:

yugabyte=# select version();
                                                                                         version

--------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------
 PostgreSQL 11.2-YB-2.17.2.0-b0 on x86_64-pc-linux-gnu, compiled by clang version 15.0.3 (https://github.com/y
ugabyte/llvm-project.git 0b8d1183745fd3998d8beffeec8cbe99c1b20529), 64-bit
(1 row)

Time: 2.136 ms

Schema and data loading: Tried everything in colocated database:

create table test (h int, r int); 
insert into test  select i, i from generate_series(1, 10000) as  i; 

create table test_pkey (h int, r int, primary key (h asc)); 
insert into test_pkey  select i, i from generate_series(1, 10000) as  i; 

create table test_idx (h int, r int); 
create index on test_idx (h asc);
insert into test_idx  select i, i from generate_series(1, 10000) as  i; 

seq_scan with on a table without pkey or index:

yb_colocated=# explain (analyze, dist) select h from test where h > 1 order by h;
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Sort  (cost=152.33..154.83 rows=1000 width=4) (actual time=23.174..24.418 rows=9999 loops=1)
   Sort Key: h
   Sort Method: quicksort  Memory: 853kB
   ->  Seq Scan on test  (cost=0.00..102.50 rows=1000 width=4) (actual time=2.572..20.166 rows=9999 loops=1)
         Remote Filter: (h > 1)
         Storage Table Read Requests: 10
         Storage Table Execution Time: 19.000 ms
 Planning Time: 0.043 ms
 Execution Time: 25.237 ms
 Storage Read Requests: 10
 Storage Write Requests: 0
 Storage Execution Time: 19.000 ms
 Peak Memory Usage: 925 kB
(13 rows)

Index Scan with pkey:

yb_colocated=# explain (analyze, dist) select h from test_pkey where h > 1 order by h;
                                                          QUERY PLAN

--------------------------------------------------------------------------------------------------------------
----------------
 Index Scan using test_pkey_pkey on test_pkey  (cost=0.00..4.11 rows=1 width=4) (actual time=1.994..16.817 row
s=9999 loops=1)
   Index Cond: (h > 1)
   Storage Index Read Requests: 10
   Storage Index Execution Time: 13.000 ms
 Planning Time: 0.058 ms
 Execution Time: 17.794 ms
 Storage Read Requests: 10
 Storage Write Requests: 0
 Storage Execution Time: 13.000 ms
 Peak Memory Usage: 8 kB
(10 rows)

Time: 18.576 ms

Index Only Scan with on a table without pkey and with index present :

explain (analyze, dist) select h from test_idx where h > 1 order by h;

yb_colocated=# explain (analyze, dist) select h from test_idx where h > 1 order by h;
                                                            QUERY PLAN

--------------------------------------------------------------------------------------------------------------
---------------------
 Index Only Scan using test_idx_h_idx on test_idx  (cost=0.00..5.12 rows=10 width=4) (actual time=2.416..19.97
9 rows=9999 loops=1)
   Index Cond: (h > 1)
   Heap Fetches: 0
   Storage Index Read Requests: 10
   Storage Index Execution Time: 15.000 ms
 Planning Time: 0.058 ms
 Execution Time: 20.834 ms
 Storage Read Requests: 10
 Storage Write Requests: 0
 Storage Execution Time: 15.000 ms
 Peak Memory Usage: 0 kB
(11 rows)

Time: 21.642 ms
rthallamko3 commented 7 months ago

Based on the report, it looks like the filter is passed in. cc @shantanugupta-yb