yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.93k stars 1.06k forks source link

[YSQL] NL/BNL parallel plans for ORDER BY LIMIT much slower than the serial plans #24247

Open mtakahar opened 1 week ago

mtakahar commented 1 week ago

Jira Link: DB-13144

Description

Execution time:   NL BNL
serial 2427.227 ms 74.240 ms
parallel 11359.289 ms 143.954 ms
Rows returned from Index Only Scan mk:   NL BNL
serial 14289 14336
parallel 3 x 20905 3 x 20923

query:

SELECT
FROM keyword AS k,
     movie_keyword AS mk
WHERE k.keyword LIKE '%sequel%'
  AND k.id = mk.keyword_id
ORDER BY k.id LIMIT 1;

serial NL plan:

 Limit (actual time=2425.345..2425.346 rows=1 loops=1)
   ->  Nested Loop (actual time=2425.342..2425.342 rows=1 loops=1)
         ->  Index Only Scan using keyword_id_movie_keyword on movie_keyword mk (actual time=0.384..74.462 rows=14289 loops=1)
               Heap Fetches: 0
               Storage Index Read Requests: 2
               Storage Index Read Execution Time: 55.080 ms
               Storage Index Rows Scanned: 100000
         ->  Index Scan using keyword_pkey on keyword k (actual time=0.148..0.148 rows=0 loops=14289)
               Index Cond: (id = mk.keyword_id)
               Storage Filter: (keyword ~~ '%sequel%'::text)
               Storage Table Read Requests: 1
               Storage Table Read Execution Time: 0.135 ms
               Storage Table Rows Scanned: 1
 Planning Time: 10.411 ms
 Execution Time: 2427.227 ms
 Storage Read Requests: 14291
 Storage Read Execution Time: 1980.340 ms
 Storage Rows Scanned: 114289
 Storage Write Requests: 0
 Catalog Read Requests: 76
 Catalog Read Execution Time: 22.574 ms
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 2002.913 ms
 Peak Memory Usage: 25122 kB
(25 rows)

parallel NL (the storage stats in both parallel plans look incorrect: https://github.com/yugabyte/yugabyte-db/issues/24249):

 Limit (actual time=11334.783..11357.962 rows=1 loops=1)
   ->  Gather Merge (actual time=11334.782..11357.960 rows=1 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Nested Loop (actual time=5440.818..5440.819 rows=1 loops=3)
               ->  Parallel Index Only Scan using keyword_id_movie_keyword on movie_keyword mk (actual time=21.201..50.125 rows=20905 loops=3)
                     Heap Fetches: 0
                     Storage Index Read Requests: 0
                     Storage Index Read Execution Time: 10.341 ms
                     Storage Index Rows Scanned: 15733
               ->  Index Scan using keyword_pkey on keyword k (actual time=0.241..0.241 rows=0 loops=62716)
                     Index Cond: (id = mk.keyword_id)
                     Storage Filter: (keyword ~~ '%sequel%'::text)
                     Storage Table Read Requests: 0
                     Storage Table Read Execution Time: 0.036 ms
                     Storage Table Rows Scanned: 0
 Planning Time: 0.268 ms
 Execution Time: 11359.289 ms
 Storage Read Requests: 14290
 Storage Read Execution Time: 2316.823 ms
 Storage Rows Scanned: 61487
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 2316.823 ms
 Peak Memory Usage: 24907 kB
(27 rows)

serial BNL:

 Limit (actual time=73.823..73.824 rows=1 loops=1)
   ->  YB Batched Nested Loop Join (actual time=73.822..73.822 rows=1 loops=1)
         Join Filter: (k.id = mk.keyword_id)
         Sort Keys: k.id
         ->  Index Only Scan using keyword_id_movie_keyword on movie_keyword mk (actual time=0.196..56.428 rows=14336 loops=1)
               Heap Fetches: 0
               Storage Index Read Requests: 2
               Storage Index Read Execution Time: 53.222 ms
               Storage Index Rows Scanned: 100000
         ->  Index Scan using keyword_pkey on keyword k (actual time=0.270..0.270 rows=0 loops=14)
               Index Cond: (id = ANY (ARRAY[mk.keyword_id, $1, $2, ..., $1023]))
               Storage Filter: (keyword ~~ '%sequel%'::text)
               Storage Table Read Requests: 1
               Storage Table Read Execution Time: 0.206 ms
               Storage Table Rows Scanned: 29
 Planning Time: 1.248 ms
 Execution Time: 74.240 ms
 Storage Read Requests: 16
 Storage Read Execution Time: 56.113 ms
 Storage Rows Scanned: 100411
 Storage Write Requests: 0
 Catalog Read Requests: 3
 Catalog Read Execution Time: 1.277 ms
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 57.389 ms
 Peak Memory Usage: 1236 kB
(27 rows)

parallel BNL:

 Limit (actual time=127.529..143.715 rows=1 loops=1)
   ->  Gather Merge (actual time=127.528..143.712 rows=1 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  YB Batched Nested Loop Join (actual time=59.327..59.327 rows=1 loops=3)
               Join Filter: (k.id = mk.keyword_id)
               Sort Keys: k.id
               ->  Parallel Index Only Scan using keyword_id_movie_keyword on movie_keyword mk (actual time=21.011..25.934 rows=20923 loops=3)
                     Heap Fetches: 0
                     Storage Index Read Requests: 0
                     Storage Index Read Execution Time: 10.186 ms
                     Storage Index Rows Scanned: 15733
               ->  Index Scan using keyword_pkey on keyword k (actual time=0.987..0.987 rows=0 loops=62)
                     Index Cond: (id = ANY (ARRAY[mk.keyword_id, $1, $2, ..., $1023]))
                     Storage Filter: (keyword ~~ '%sequel%'::text)
                     Storage Table Read Requests: 0
                     Storage Table Read Execution Time: 0.064 ms
                     Storage Table Rows Scanned: 7
 Planning Time: 0.747 ms
 Execution Time: 143.954 ms
 Storage Read Requests: 15
 Storage Read Execution Time: 34.555 ms
 Storage Rows Scanned: 47609
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 34.555 ms
 Peak Memory Usage: 1628 kB
(29 rows)

Test case

  1. Untar testcase.tar.gz
  2. Run create_and_load.sql
  3. Run query.sql

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

mtakahar commented 1 week ago

It appears to me that the limit was not propagated to the parallel workers, but can be something else.

mtakahar commented 3 days ago

The problem not observed on pg15 based master (1e0cdcd433433e115f26f5a50d988a07f60e1d03): (disabled memoize by Set(enable_memoize off) for apple-to-apple comparison with pg11 based build).

Execution time:   NL BNL
serial 2348.253 ms 63.305 ms
parallel 2298.758 ms 64.770 ms
Rows returned from Index Only Scan mk:   NL BNL
serial 14289 14336
parallel 3 x 4763 3 x 4779

The plans:

SELECT version();
                                                             version                                                              
----------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 15.2-YB-2.25.0.0-b0 on x86_64-apple-darwin23.6.0, compiled by Apple clang version 15.0.0 (clang-1500.1.0.2.5), 64-bit
(1 row)

BEGIN;
BEGIN
SET LOCAL max_parallel_workers=0;
SET
SET LOCAL max_parallel_workers_per_gather=0;
SET
SET LOCAL yb_parallel_range_rows=0;
SET
/*+
  Leading((mk k))
  Set(enable_material off)
  Set(enable_memoize off)
  Set(yb_enable_batchednl off)
  Set(yb_bnl_batch_size 1)
  Set(enable_nestloop ON)
  Set(enable_hashjoin off)
  Set(enable_mergejoin off)
*/
EXPLAIN (ANALYZE, COSTS OFF, DIST)
SELECT
FROM keyword AS k,
     movie_keyword AS mk
WHERE k.keyword LIKE '%sequel%'
  AND k.id = mk.keyword_id
ORDER BY k.id LIMIT 1;
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Limit (actual time=2346.243..2346.244 rows=1 loops=1)
   ->  Nested Loop (actual time=2346.242..2346.242 rows=1 loops=1)
         ->  Index Only Scan using keyword_id_movie_keyword on movie_keyword mk (actual time=0.434..70.753 rows=14289 loops=1)
               Heap Fetches: 0
               Storage Index Read Requests: 2
               Storage Index Read Execution Time: 49.131 ms
               Storage Index Rows Scanned: 100000
         ->  Index Scan using keyword_pkey on keyword k (actual time=0.144..0.144 rows=0 loops=14289)
               Index Cond: (id = mk.keyword_id)
               Storage Filter: (keyword ~~ '%sequel%'::text)
               Storage Table Read Requests: 1
               Storage Table Read Execution Time: 0.130 ms
               Storage Table Rows Scanned: 1
 Planning Time: 16.545 ms
 Execution Time: 2348.253 ms
 Storage Read Requests: 14291
 Storage Read Execution Time: 1907.882 ms
 Storage Rows Scanned: 114289
 Storage Write Requests: 0
 Catalog Read Requests: 83
 Catalog Read Execution Time: 29.096 ms
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 1936.978 ms
 Peak Memory Usage: 25126 kB
(25 rows)

/*+
  Leading((mk k))
  Set(enable_material off)
  Set(enable_memoize off)
  Set(yb_enable_batchednl ON)
  Set(yb_bnl_batch_size 1024)
  Set(enable_nestloop off)
  Set(enable_hashjoin off)
  Set(enable_mergejoin off)
*/
EXPLAIN (ANALYZE, COSTS OFF, DIST)
SELECT
FROM keyword AS k,
     movie_keyword AS mk
WHERE k.keyword LIKE '%sequel%'
  AND k.id = mk.keyword_id
ORDER BY k.id LIMIT 1;
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Limit (actual time=62.830..62.831 rows=1 loops=1)
   ->  YB Batched Nested Loop Join (actual time=62.829..62.829 rows=1 loops=1)
         Join Filter: (k.id = mk.keyword_id)
         Sort Keys: k.id
         ->  Index Only Scan using keyword_id_movie_keyword on movie_keyword mk (actual time=0.274..51.976 rows=14336 loops=1)
               Heap Fetches: 0
               Storage Index Read Requests: 2
               Storage Index Read Execution Time: 48.589 ms
               Storage Index Rows Scanned: 100000
         ->  Index Scan using keyword_pkey on keyword k (actual time=0.296..0.296 rows=0 loops=14)
               Index Cond: (id = ANY (ARRAY[mk.keyword_id, $1, $2, ..., $1023]))
               Storage Filter: (keyword ~~ '%sequel%'::text)
               Storage Table Read Requests: 1
               Storage Table Read Execution Time: 0.219 ms
               Storage Table Rows Scanned: 29
 Planning Time: 1.028 ms
 Execution Time: 63.305 ms
 Storage Read Requests: 16
 Storage Read Execution Time: 51.655 ms
 Storage Rows Scanned: 100411
 Storage Write Requests: 0
 Catalog Read Requests: 3
 Catalog Read Execution Time: 1.376 ms
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 53.031 ms
 Peak Memory Usage: 1244 kB
(27 rows)

ROLLBACK;
ROLLBACK
BEGIN;
BEGIN
SET LOCAL force_parallel_mode = ON;
SET
/*+
  Leading((mk k))
  Set(enable_material off)
  Set(enable_memoize off)
  Set(yb_enable_batchednl off)
  Set(yb_bnl_batch_size 1)
  Set(enable_nestloop ON)
  Set(enable_hashjoin off)
  Set(enable_mergejoin off)
*/
EXPLAIN (ANALYZE, COSTS OFF, DIST)
SELECT
FROM keyword AS k,
     movie_keyword AS mk
WHERE k.keyword LIKE '%sequel%'
  AND k.id = mk.keyword_id
ORDER BY k.id LIMIT 1;
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Limit (actual time=2297.071..2297.158 rows=1 loops=1)
   ->  Gather Merge (actual time=2297.070..2297.157 rows=1 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Nested Loop (actual time=765.509..765.509 rows=0 loops=3)
               ->  Parallel Index Only Scan using keyword_id_movie_keyword on movie_keyword mk (actual time=17.485..24.570 rows=4763 loops=3)
                     Heap Fetches: 0
                     Storage Index Read Requests: 0
                     Storage Index Read Execution Time: 17.437 ms
                     Storage Index Rows Scanned: 33333
               ->  Index Scan using keyword_pkey on keyword k (actual time=0.141..0.141 rows=0 loops=14289)
                     Index Cond: (id = mk.keyword_id)
                     Storage Filter: (keyword ~~ '%sequel%'::text)
                     Storage Table Read Requests: 1
                     Storage Table Read Execution Time: 0.128 ms
                     Storage Table Rows Scanned: 1
 Planning Time: 0.248 ms
 Execution Time: 2298.758 ms
 Storage Read Requests: 14290
 Storage Read Execution Time: 1880.767 ms
 Storage Rows Scanned: 114289
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 1880.767 ms
 Peak Memory Usage: 25175 kB
(27 rows)

/*+
  Leading((mk k))
  Set(enable_material off)
  Set(enable_memoize off)
  Set(yb_enable_batchednl ON)
  Set(yb_bnl_batch_size 1024)
  Set(enable_nestloop off)
  Set(enable_hashjoin off)
  Set(enable_mergejoin off)
*/
EXPLAIN (ANALYZE, COSTS OFF, DIST)
SELECT
FROM keyword AS k,
     movie_keyword AS mk
WHERE k.keyword LIKE '%sequel%'
  AND k.id = mk.keyword_id
ORDER BY k.id LIMIT 1;
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Limit (actual time=64.540..64.582 rows=1 loops=1)
   ->  Gather Merge (actual time=64.539..64.581 rows=1 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  YB Batched Nested Loop Join (actual time=21.636..21.637 rows=0 loops=3)
               Join Filter: (k.id = mk.keyword_id)
               Sort Keys: k.id
               ->  Parallel Index Only Scan using keyword_id_movie_keyword on movie_keyword mk (actual time=16.611..17.773 rows=4779 loops=3)
                     Heap Fetches: 0
                     Storage Index Read Requests: 0
                     Storage Index Read Execution Time: 16.568 ms
                     Storage Index Rows Scanned: 33333
               ->  Index Scan using keyword_pkey on keyword k (actual time=0.262..0.262 rows=0 loops=14)
                     Index Cond: (id = ANY (ARRAY[mk.keyword_id, $1, $2, ..., $1023]))
                     Storage Filter: (keyword ~~ '%sequel%'::text)
                     Storage Table Read Requests: 1
                     Storage Table Read Execution Time: 0.233 ms
                     Storage Table Rows Scanned: 29
 Planning Time: 0.617 ms
 Execution Time: 64.770 ms
 Storage Read Requests: 15
 Storage Read Execution Time: 52.963 ms
 Storage Rows Scanned: 100411
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 52.963 ms
 Peak Memory Usage: 1683 kB
(29 rows)

ROLLBACK;
ROLLBACK