tanujnay112 opened 1 year ago
With the new cost model (#18484) we get the following plans:

```
yugabyte=# explain analyze SELECT b FROM sample where b = 5;
                                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using good_index on sample  (cost=7.20..81.45 rows=100 width=4) (actual time=1.975..2.039 rows=100 loops=1)
   Index Cond: (b = 5)
   Heap Fetches: 0
 Planning Time: 0.114 ms
 Execution Time: 2.170 ms
 Peak Memory Usage: 24 kB
(6 rows)
```
```
yugabyte=# /*+IndexScan(sample bad_index)*/ explain analyze SELECT b FROM sample where b = 5;
                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Index Scan using bad_index on sample  (cost=14.40..976.41 rows=100 width=4) (actual time=8.668..8.788 rows=100 loops=1)
   Index Cond: (b = 5)
 Planning Time: 0.186 ms
 Execution Time: 8.994 ms
 Peak Memory Usage: 24 kB
(5 rows)
```
The estimated cost of the index scan on bad_index is nearly 12x that of the good_index plan, while its execution time is only about 4.4x worse. The seek estimate seems accurate in the second case, but the number of nexts is underestimated by the cost model. This may be improved by the ongoing tuning exercise to adjust the cost parameters used inside the cost model.
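One way to see where the seek/next estimates diverge from the actual work done is to compare the storage-layer read statistics of the two plans. A minimal sketch, assuming a YugabyteDB build where EXPLAIN accepts the DIST option (which adds storage read-request counters to the output):

```sql
-- Compare storage-layer read requests for the two access paths.
EXPLAIN (ANALYZE, DIST) SELECT b FROM sample WHERE b = 5;

/*+ IndexScan(sample bad_index) */
EXPLAIN (ANALYZE, DIST) SELECT b FROM sample WHERE b = 5;
```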
Jira Link: DB-6753
Description
Consider the following setup
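The exact setup is not shown above, so the following is a hypothetical reconstruction. Only the table name, the index names, and the roughly 100 rows matching b = 5 come from the plans above; the column names and the data shape are assumptions.

```sql
-- Hypothetical reconstruction of the setup (schema and data shape assumed).
CREATE TABLE sample (a int, b int);

-- good_index leads with b, so the predicate b = 5 constrains a prefix
-- of its key columns.
CREATE INDEX good_index ON sample (b ASC, a ASC);

-- bad_index carries b only as a trailing key column.
CREATE INDEX bad_index ON sample (a ASC, b ASC);

-- 100,000 rows; each of the 1,000 distinct b values appears 100 times.
INSERT INTO sample SELECT i, i % 1000 FROM generate_series(1, 100000) i;
ANALYZE sample;
```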
Running the query against each index, good_index, which has a prefix of its key columns constrained by the predicate, operates ~3.7x faster than bad_index, yet the two cost estimates are pretty much the same. In PG, the btree costing function is aware that constraints on a prefix of an index's key columns limit the scan space; our YB index costing logic does not appear to take this into account.
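For intuition, a back-of-the-envelope comparison of the scan space under the assumed data shape from the reconstruction above (100,000 rows, 100 of them matching b = 5). This is a simplification for illustration, not PG's actual btcostestimate arithmetic:

```sql
-- Rough scan-space estimate: a prefix-aware costing scales the index scan
-- with the selectivity of the prefix quals, while a prefix-unaware costing
-- effectively prices a traversal of the whole index.
SELECT
    100000                     AS total_rows,          -- N (assumed)
    100.0 / 100000             AS selectivity_b_eq_5,
    (100.0 / 100000) * 100000  AS good_index_entries,  -- b = 5 bounds the key range
    100000                     AS bad_index_entries;   -- b = 5 only filters rows
```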