Open shantanugupta-yb opened 1 year ago
Tried this with a simpler test:
yugabyte=# select version();
version
--------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------
PostgreSQL 11.2-YB-2.17.2.0-b0 on x86_64-pc-linux-gnu, compiled by clang version 15.0.3 (https://github.com/y
ugabyte/llvm-project.git 0b8d1183745fd3998d8beffeec8cbe99c1b20529), 64-bit
(1 row)
Time: 2.136 ms
Schema and data loading (everything was run in a colocated database):
create table test (h int, r int);
insert into test select i, i from generate_series(1, 10000) as i;
create table test_pkey (h int, r int, primary key (h asc));
insert into test_pkey select i, i from generate_series(1, 10000) as i;
create table test_idx (h int, r int);
create index on test_idx (h asc);
insert into test_idx select i, i from generate_series(1, 10000) as i;
Seq Scan on a table without pkey or index:
yb_colocated=# explain (analyze, dist) select h from test where h > 1 order by h;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Sort (cost=152.33..154.83 rows=1000 width=4) (actual time=23.174..24.418 rows=9999 loops=1)
Sort Key: h
Sort Method: quicksort Memory: 853kB
-> Seq Scan on test (cost=0.00..102.50 rows=1000 width=4) (actual time=2.572..20.166 rows=9999 loops=1)
Remote Filter: (h > 1)
Storage Table Read Requests: 10
Storage Table Execution Time: 19.000 ms
Planning Time: 0.043 ms
Execution Time: 25.237 ms
Storage Read Requests: 10
Storage Write Requests: 0
Storage Execution Time: 19.000 ms
Peak Memory Usage: 925 kB
(13 rows)
Index Scan with pkey:
yb_colocated=# explain (analyze, dist) select h from test_pkey where h > 1 order by h;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Index Scan using test_pkey_pkey on test_pkey (cost=0.00..4.11 rows=1 width=4) (actual time=1.994..16.817 rows=9999 loops=1)
Index Cond: (h > 1)
Storage Index Read Requests: 10
Storage Index Execution Time: 13.000 ms
Planning Time: 0.058 ms
Execution Time: 17.794 ms
Storage Read Requests: 10
Storage Write Requests: 0
Storage Execution Time: 13.000 ms
Peak Memory Usage: 8 kB
(10 rows)
Time: 18.576 ms
Index Only Scan on a table without pkey but with an index present:
yb_colocated=# explain (analyze, dist) select h from test_idx where h > 1 order by h;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using test_idx_h_idx on test_idx (cost=0.00..5.12 rows=10 width=4) (actual time=2.416..19.979 rows=9999 loops=1)
Index Cond: (h > 1)
Heap Fetches: 0
Storage Index Read Requests: 10
Storage Index Execution Time: 15.000 ms
Planning Time: 0.058 ms
Execution Time: 20.834 ms
Storage Read Requests: 10
Storage Write Requests: 0
Storage Execution Time: 15.000 ms
Peak Memory Usage: 0 kB
(11 rows)
Time: 21.642 ms
Based on the report, it looks like the filter is passed in. cc @shantanugupta-yb
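To compare the three plans side by side, the summary counters can be pulled out of the EXPLAIN (ANALYZE, DIST) text with a few regular expressions. This is only a rough sketch: the function name `summarize` and the `plans` dictionary are hypothetical, and the plan excerpts are abbreviated copies of the outputs in this comment.

```python
import re

# Summary lines to extract. Anchoring at line start keeps per-node lines
# such as "Storage Table Execution Time" from matching.
PATTERNS = {
    "read_requests": r"^\s*Storage Read Requests:\s*([\d.]+)",
    "storage_ms": r"^\s*Storage Execution Time:\s*([\d.]+) ms",
    "execution_ms": r"^\s*Execution Time:\s*([\d.]+) ms",
}


def summarize(plan_text):
    """Return the summary counters found in one EXPLAIN (ANALYZE, DIST) output."""
    result = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, plan_text, re.MULTILINE)
        result[key] = float(match.group(1)) if match else None
    return result


# Abbreviated excerpts of the three plans shown in this comment.
plans = {
    "seq scan (test)": """
        Storage Table Read Requests: 10
        Storage Table Execution Time: 19.000 ms
        Execution Time: 25.237 ms
        Storage Read Requests: 10
        Storage Execution Time: 19.000 ms
    """,
    "index scan (test_pkey)": """
        Storage Index Read Requests: 10
        Storage Index Execution Time: 13.000 ms
        Execution Time: 17.794 ms
        Storage Read Requests: 10
        Storage Execution Time: 13.000 ms
    """,
    "index only scan (test_idx)": """
        Storage Index Read Requests: 10
        Storage Index Execution Time: 15.000 ms
        Execution Time: 20.834 ms
        Storage Read Requests: 10
        Storage Execution Time: 15.000 ms
    """,
}

for name, text in plans.items():
    print(name, summarize(text))
```

In this smaller test all three plans report the same number of storage read requests, so the difference in RPC counts from the description does not show up at 10000 rows.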
Jira Link: DB-5383
Description
When a query has an ORDER BY clause and a filter condition on a column that is neither indexed nor the primary key, YB pushes the expression down, performs a sequential scan on the actual table, and finally sorts the rows in the query layer. When the column is the primary key or is indexed, the filter condition/expression is not pushed down and all rows are fetched in batches of 1024. To sort 999999 rows, the sequential scan with expression pushdown issues 327 RPCs, whereas the index/index-only scan issues 977 RPCs.
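The reported RPC counts can be sanity-checked against the 1024-row batch size with simple arithmetic. The sketch below assumes one RPC per fetched batch, which is an assumption rather than something the issue states. The helper name `expected_requests` is hypothetical.

```python
import math

BATCH_SIZE = 1024  # rows per fetch, as stated in the description


def expected_requests(rows, batch_size=BATCH_SIZE):
    """Number of fetches needed if every request returns one full batch."""
    return math.ceil(rows / batch_size)


# Index / index-only scan over 999999 rows, from the description.
print(expected_requests(999999))  # 977

# The simpler 10000-row test, where the filter h > 1 returns 9999 rows.
print(expected_requests(9999))  # 10

# Average rows per RPC implied by the 327 RPCs reported for the
# sequential scan with expression pushdown.
print(round(999999 / 327))  # 3058
```

Under that assumption, 977 matches a plain 1024-row batch for the index scans, while 327 RPCs implies roughly three times as many rows per request on the sequential scan path. The 10 read requests in the simpler test are also what 1024-row batches would give for 9999 rows.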
sequential scan+expression push down:
index/indexonly scan:
Schema details: