[YSQL] SELECT DISTINCT using DESC key index is slow because of unnecessary backward scan

mtakahar commented 2 years ago

Jira Link: DB-3035

Description

When there is a matching index, "SELECT DISTINCT ..." may be executed with a Unique node that simply removes the duplicates from the incoming tuples returned from the index scan instead of using more expensive HashAggregate node.

In the example below, the optimizer chooses backward scan where forward scan suffices, then the query runs slower because of a known issue with backward scan (#12609).

Example:

CREATE TABLE t1 (c1 int, c2 int, c3 text);
INSERT INTO t1 SELECT i, i, lpad(i::text, 100, '#')  FROM generate_series(0, 999999) i;

CREATE INDEX i_t1_c2_desc on t1 (c2 DESC);

yugabyte=# EXPLAIN ANALYZE SELECT DISTINCT c2 FROM t1 LIMIT 100;
                                                                    QUERY PLAN                                                                    
--------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..63.25 rows=100 width=4) (actual time=1.721..3.028 rows=100 loops=1)
   ->  Unique  (cost=0.00..126.50 rows=200 width=4) (actual time=1.720..3.010 rows=100 loops=1)
         ->  Index Only Scan Backward using i_t1_c2_desc on t1  (cost=0.00..124.00 rows=1000 width=4) (actual time=1.719..2.982 rows=100 loops=1)
               Heap Fetches: 0
 Planning Time: 0.074 ms
 Execution Time: 4.231 ms
 Peak Memory Usage: 8 kB
(7 rows)

The same query runs much faster with an index with ASC key:

CREATE INDEX i_t1_c2_asc on t1 (c2 ASC);

yugabyte=# EXPLAIN ANALYZE SELECT DISTINCT c2 FROM t1 LIMIT 100;
                                                               QUERY PLAN                                                               
----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..58.25 rows=100 width=4) (actual time=2.071..2.126 rows=100 loops=1)
   ->  Unique  (cost=0.00..116.50 rows=200 width=4) (actual time=2.070..2.114 rows=100 loops=1)
         ->  Index Only Scan using i_t1_c2_asc on t1  (cost=0.00..114.00 rows=1000 width=4) (actual time=2.069..2.095 rows=100 loops=1)
               Heap Fetches: 0
 Planning Time: 0.054 ms
 Execution Time: 2.159 ms
 Peak Memory Usage: 8 kB
(7 rows)

A workaround - add ORDER BY ... DESC to the query:

yugabyte=# EXPLAIN ANALYZE SELECT DISTINCT c2 FROM t1 ORDER BY c2 DESC LIMIT 100;
                                                               QUERY PLAN                                                                
-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..58.25 rows=100 width=4) (actual time=2.139..2.194 rows=100 loops=1)
   ->  Unique  (cost=0.00..116.50 rows=200 width=4) (actual time=2.138..2.182 rows=100 loops=1)
         ->  Index Only Scan using i_t1_c2_desc on t1  (cost=0.00..114.00 rows=1000 width=4) (actual time=2.137..2.163 rows=100 loops=1)
               Heap Fetches: 0
 Planning Time: 0.050 ms
 Execution Time: 2.227 ms
 Peak Memory Usage: 8 kB
(7 rows)

m-iancu commented 1 year ago

PG 11 seems to also force a backwards index scan (might need to disable seqscan and bitmap scan first to get planner to choose an index scan). So this might be a PG 11 vs PG 14 difference (rather than a YSQL-specific issue).

mihnea=# EXPLAIN ANALYZE SELECT DISTINCT c2 FROM t1 LIMIT 100;
                                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..5.00 rows=100 width=4) (actual time=0.017..0.088 rows=100 loops=1)
   ->  Unique  (cost=0.42..45725.43 rows=1000000 width=4) (actual time=0.017..0.072 rows=100 loops=1)
         ->  Index Only Scan Backward using i_t1_c2_desc on t1  (cost=0.42..43225.43 rows=1000000 width=4) (actual time=0.016..0.047 rows=100 loops=1)
               Heap Fetches: 100
 Planning Time: 0.065 ms
 Execution Time: 0.115 ms
(6 rows)

sushantrmishra commented 1 year ago

So there are two issues:

Plan issue: Pick the faster plan i.e forward scan vs backward scan .
backward scans are slow, which is tracked by https://github.com/yugabyte/yugabyte-db/issues/12609

This issue can be used to track 1.

tverona1 commented 1 year ago

The new cost model (beta) already takes into account the cost for backward vs forward scan. However, the issue here is that PG 11 seems to force a backwards index scan. Keeping this open to re-verify with PG15.

yugabyte / yugabyte-db

[YSQL] SELECT DISTINCT using DESC key index is slow because of unnecessary backward scan #13406

Description