Open mtakahar opened 2 years ago
PG 11 also seems to force a backward index scan (you might need to disable seqscan and bitmap scan first to get the planner to choose an index scan). So this might be a PG 11 vs. PG 14 difference rather than a YSQL-specific issue.
mihnea=# EXPLAIN ANALYZE SELECT DISTINCT c2 FROM t1 LIMIT 100;
                                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..5.00 rows=100 width=4) (actual time=0.017..0.088 rows=100 loops=1)
   ->  Unique  (cost=0.42..45725.43 rows=1000000 width=4) (actual time=0.017..0.072 rows=100 loops=1)
         ->  Index Only Scan Backward using i_t1_c2_desc on t1  (cost=0.42..43225.43 rows=1000000 width=4) (actual time=0.016..0.047 rows=100 loops=1)
               Heap Fetches: 100
 Planning Time: 0.065 ms
 Execution Time: 0.115 ms
(6 rows)
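For anyone who wants to re-run this on vanilla PostgreSQL, a setup along these lines may reproduce the plan above. This is only a sketch under assumptions: the names t1, c2, and i_t1_c2_desc and the roughly 1,000,000-row estimate come from the plan, while the column c1, the primary key, and the generated data are made up for illustration. The two SET commands are the standard planner settings for the "disable seqscan and bitmap scan" step mentioned above.

```sql
-- Assumed schema and data: only t1, c2, and i_t1_c2_desc appear in the plan above.
CREATE TABLE t1 (c1 int PRIMARY KEY, c2 int);
INSERT INTO t1 SELECT i, i FROM generate_series(1, 1000000) AS i;
CREATE INDEX i_t1_c2_desc ON t1 (c2 DESC);
ANALYZE t1;

-- Discourage the planner from picking a sequential or bitmap scan.
SET enable_seqscan = off;
SET enable_bitmapscan = off;

EXPLAIN ANALYZE SELECT DISTINCT c2 FROM t1 LIMIT 100;
```

The scan direction shows up on the index scan node of the plan: "Backward" after the scan type indicates a backward scan.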
So there are two issues:
This issue can be used to track 1.
The new cost model (beta) already accounts for the cost difference between backward and forward scans. However, the issue here is that PG 11 seems to force a backward index scan. Keeping this open to re-verify with PG 15.
Jira Link: DB-3035
Description
When there is a matching index, "SELECT DISTINCT ..." may be executed with a Unique node, which simply removes duplicates from the tuples returned by the index scan, instead of the more expensive HashAggregate node.
In the example below, the optimizer chooses a backward scan where a forward scan would suffice, and the query then runs slower because of a known issue with backward scans (#12609).
Example:
The same query runs much faster with an index with ASC key:
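A comparison of this kind could be set up roughly as follows. This is a sketch, not the original example: the index name i_t1_c2_asc is hypothetical, and t1, c2, and i_t1_c2_desc are assumed to match the query and plan quoted elsewhere in this thread.

```sql
-- Hypothetical ASC index; t1 and c2 are assumed from the query quoted in this thread.
DROP INDEX IF EXISTS i_t1_c2_desc;
CREATE INDEX i_t1_c2_asc ON t1 (c2 ASC);

-- Re-run the same query and compare the scan direction and execution time.
EXPLAIN ANALYZE SELECT DISTINCT c2 FROM t1 LIMIT 100;
```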
A workaround: add ORDER BY ... DESC to the query:
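Applied to the query quoted elsewhere in this thread, the workaround might look like the sketch below. It assumes the DESC index i_t1_c2_desc on t1 (c2 DESC) and uses c2 as the ORDER BY column; the original example may have used different names.

```sql
-- Assumes an index such as i_t1_c2_desc on t1 (c2 DESC).
-- The requested order matches the index order, so a forward scan can satisfy it.
EXPLAIN ANALYZE
SELECT DISTINCT c2 FROM t1 ORDER BY c2 DESC LIMIT 100;
```

Because a forward scan of a DESC index already returns c2 in descending order, the planner has no need for a backward scan to satisfy the ORDER BY. Whether it actually picks the forward scan should be confirmed from the EXPLAIN output.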