[YSQL] [Parallel Query] Few index scan microbenchmarks regressed by 5X - 8X with the feature enabled

shantanugupta-yb commented 7 months ago

Description

Description Comparison is between:

Index scan workload executed without parallel workers and without analyse executed on tables Index scan workload executed without parallel workers and with analyse executed on tables Index scan workload executed with parallel workers and with analyse executed on tables Report:

Index scan microbenchmarks which have regressed: https://perf.dev.yugabyte.com/report/view/W3sibmFtZSI6ImluZCBjb2xvIGRlZmF1bHQgKyBBbmFseXNlIiwidGVzdF9pZCI6IjUwMjczMDIiLCJpc0Jhc2VsaW5lIjp0cnVlfSx7Im5hbWUiOiJpbmQgY29sbyBwYXJhbGxlbCArIEFuYWx5c2UiLCJ0ZXN0X2lkIjoiNTAyNzQwMiIsImlzQmFzZWxpbmUiOmZhbHNlfSx7Im5hbWUiOiJpbmQgY29sbyBkZWZhdWx0ICIsInRlc3RfaWQiOiI0OTg2MzAyIiwiaXNCYXNlbGluZSI6ZmFsc2V9XQ==

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

[X] I confirm this issue does not contain any sensitive information.

rthallamko3 commented 7 months ago

@andrei-mart , It looks like the cost of Parallel Index Scan (with multiple workers) is higher than regular Index scan (without workers). Do we need to tweak the cost model to not invoke parallel query feature in such cases? cc @sushantrmishra


Gather  (cost=1008.44..8921.22 rows=100 width=791) (actual time=1.760..109.041 rows=108 loops=1)
  Output: col_varchar_card1_1, col_varchar_card2_1, col_varchar_card3_1, col_bigint_id_1, col_bigint_id_2, col_bigint_id_3, col_float2_1, col_float2_2, col_float5_1, col_float5_2, col_boolean_1, col_varchar10_id_1, col_varchar100_id_1, col_varchar100_id_2, col_varchar500_id_1
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Index Scan using pkeyvarchar10_1_pkey on public.pkeyvarchar10_1  (cost=8.44..7911.22 rows=42 width=791) (actual time=0.660..77.791 rows=36 loops=3)
        Output: col_varchar_card1_1, col_varchar_card2_1, col_varchar_card3_1, col_bigint_id_1, col_bigint_id_2, col_bigint_id_3, col_float2_1, col_float2_2, col_float5_1, col_float5_2, col_boolean_1, col_varchar10_id_1, col_varchar100_id_1, col_varchar100_id_2, col_varchar500_id_1
        Index Cond: (((pkeyvarchar10_1.col_varchar_card1_1)::text = 'id-1000000'::text) AND ((pkeyvarchar10_1.col_varchar_card3_1)::text = 'id-7000999'::text))
        Storage Table Read Requests: 110
        Storage Table Read Execution Time: 25.889 ms
        Metric rocksdb_block_cache_hit: 1938.333
        Metric rocksdb_block_cache_index_hit: 331.000
        Metric rocksdb_block_cache_filter_hit: 331.000
        Metric rocksdb_block_cache_data_hit: 1276.333
        Metric rocksdb_block_cache_bytes_read: 62960819.333
        Metric rocksdb_number_db_seek: 485.333
        Metric rocksdb_number_db_next: 762.000
        Metric rocksdb_number_db_seek_found: 485.000
        Metric rocksdb_number_db_next_found: 762.000
        Metric rocksdb_iter_bytes_read: 1100114.333
        Metric rocksdb_block_cache_multi_touch_hit: 1938.333
        Metric rocksdb_block_cache_multi_touch_bytes_read: 62960819.333
        Metric docdb_keys_found: 12.000
        Metric ql_read_latency: sum: 7712.333, count: 110.333
        Estimated Seeks: 2001
        Estimated Nexts: 4100
        Worker 0: actual time=1.087..69.671 rows=47 loops=1
        Worker 1: actual time=0.556..69.618 rows=25 loops=1
Planning Time: 0.077 ms
Execution Time: 109.136 ms
Storage Read Requests: 331
Storage Read Execution Time: 77.667 ms
Storage Write Requests: 0
Catalog Read Requests: 0
Catalog Write Requests: 0
Storage Flush Requests: 0
Metric rocksdb_block_cache_hit: 5815
Metric rocksdb_block_cache_index_hit: 993
Metric rocksdb_block_cache_filter_hit: 993
Metric rocksdb_block_cache_data_hit: 3829
Metric rocksdb_block_cache_bytes_read: 188882458
Metric rocksdb_number_db_seek: 1456
Metric rocksdb_number_db_next: 2286
Metric rocksdb_number_db_seek_found: 1455
Metric rocksdb_number_db_next_found: 2286
Metric rocksdb_iter_bytes_read: 3300343
Metric rocksdb_block_cache_multi_touch_hit: 5815
Metric rocksdb_block_cache_multi_touch_bytes_read: 188882458
Metric docdb_keys_found: 36
Metric ql_read_latency: sum: 23137, count: 331
Storage Execution Time: 77.667 ms
Peak Memory Usage: 104 kB

rthallamko3 commented 7 months ago

From the report, it looks like PG is not using parallel workers for this query, but YB is using the parallel query feature. Do we need to make additional tweaks to our cost model? cc @andrei-mart , @sushantrmishra

rthallamko3 commented 7 months ago

cc @gauravk-in , @tverona1

shantanugupta-yb commented 2 months ago

Index scan quereis(skey) on 2.23.1.0-b43: 32 queries executed with parallel workers out of 36 queries | Report | 10 queries have regressed

Compared to 2.21.0.0-b333 I see that latencies have degraded further on 2.23.1.0-b43. Report which compares PV vs 2.23.1.0-b43(new run) vs 2.21.0.0-b333(old run)

yugabyte / yugabyte-db