yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
9k stars 1.07k forks source link

Using "between symmetric" causing spike to inter-AZ traffic compared to using >= & < #24773

Open yugabyte-ci opened 1 week ago

yugabyte-ci commented 1 week ago

Jira Link: DB-13863

Description

When using between symmetric, there is a massive increase in inter-az traffic; the queries and the EXPLAINs are below. We could optimise this significantly or at least at a warning to the docs highlighting the increased traffic, as increased traffic in most cloud environments will result in more costs.

Queries

Slow and significant cross-AZ traffic:

select FROM outbox_events WHERE bucket_message_key = 0 and created_at between SYMMETRIC '2023-11-16 17:00:00' and '2023-11-16 18:00:00';

Our tests seem on par (if not better than >= and <), at least in our test environment.

select FROM outbox_events WHERE bucket_message_key = 0 and created_at between '2023-11-16 17:00:00' and '2023-11-16 18:00:00';

Our tests seem on par (if not better than >= and <), at least in our test environment.

select FROM outbox_events WHERE bucket_message_key = 0 and created_at >= '2023-11-16 17:00:00' and created_at < '2023-11-16 18:00:00';

Outputs of EXPLAINs

yugabyte=# Explain (analyze, dist, verbose , costs off)
select FROM outbox_events WHERE bucket_message_key = 0 and created_at between SYMMETRIC '2023-11-16 17:00:00' and '2023-11-16 18:00:00';
                                                                                                                                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using outbox_events_pkey on public.outbox_events (actual time=766.227..766.227 rows=0 loops=1)
   Index Cond: (outbox_events.bucket_message_key = 0)
   Remote Filter: (((outbox_events.created_at >= '2023-11-16 17:00:00+00'::timestamp with time zone) AND (outbox_events.created_at <= '2023-11-16 18:00:00+00'::timestamp with time zone)) OR ((outbox_events.created_at >= '2023-11-16 18:00:00+00'::timestamp with time zone) AND (outbox_events.created_at <= '2023-11-16 17:00:00+00'::timestamp with time zone)))
   Storage Table Read Requests: 1
   Storage Table Read Execution Time: 766.055 ms
   Storage Table Rows Scanned: 999990
 Planning Time: 0.088 ms
 Execution Time: 766.288 ms
 Storage Read Requests: 1
 Storage Read Execution Time: 766.055 ms
 Storage Rows Scanned: 999990
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 766.055 ms
 Peak Memory Usage: 24 kB
(17 rows)
yugabyte=# Explain (analyze, dist, verbose , costs off)
yugabyte-# select FROM outbox_events WHERE bucket_message_key = 0 and created_at between '2023-11-16 17:00:00' and '2023-11-16 18:00:00';
                                                                                                            QUERY PLAN                                                                                                            
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using outbox_events_pkey on public.outbox_events (actual time=1.536..1.536 rows=0 loops=1)
   Index Cond: ((outbox_events.bucket_message_key = 0) AND (outbox_events.created_at >= '2023-11-16 17:00:00+00'::timestamp with time zone) AND (outbox_events.created_at <= '2023-11-16 18:00:00+00'::timestamp with time zone))
   Storage Table Read Requests: 1
   Storage Table Read Execution Time: 1.427 ms
 Planning Time: 0.075 ms
 Execution Time: 1.586 ms
 Storage Read Requests: 1
 Storage Read Execution Time: 1.427 ms
 Storage Rows Scanned: 0
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 1.427 ms
 Peak Memory Usage: 24 kB
(15 rows)
yugabyte=# Explain (analyze, dist, verbose , costs off)
yugabyte-# select FROM outbox_events WHERE bucket_message_key = 0 and created_at >= '2023-11-16 17:00:00' and created_at < '2023-11-16 18:00:00';
                                                                                                           QUERY PLAN                                                                                                            
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using outbox_events_pkey on public.outbox_events (actual time=1.409..1.409 rows=0 loops=1)
   Index Cond: ((outbox_events.bucket_message_key = 0) AND (outbox_events.created_at >= '2023-11-16 17:00:00+00'::timestamp with time zone) AND (outbox_events.created_at < '2023-11-16 18:00:00+00'::timestamp with time zone))
   Storage Table Read Requests: 1
   Storage Table Read Execution Time: 1.359 ms
 Planning Time: 0.071 ms
 Execution Time: 1.459 ms
 Storage Read Requests: 1
 Storage Read Execution Time: 1.359 ms
 Storage Rows Scanned: 0
 Storage Write Requests: 0
 Catalog Read Requests: 3
 Catalog Read Execution Time: 4.017 ms
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 5.376 ms
 Peak Memory Usage: 24 kB
(16 rows)
jasonriddell commented 5 days ago

DDL for the table:

CREATE TABLE IF NOT EXISTS outbox (
      bucket_message_key smallint NOT NULL,
      id UUID,
      account_identifier TEXT,
      span_identifier TEXT,
      topic varchar(249),
      message_key TEXT,
      payload BYTEA,
      encryption_metadata TEXT,
      created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
      PRIMARY KEY ((bucket_message_key), created_at asc, id asc)
    ) SPLIT INTO 36 TABLETS;

ALTER TABLE outbox ALTER COLUMN message_key DROP NOT NULL;

CREATE INDEX CONCURRENTLY IF NOT EXISTS outbox_id_idx ON outbox(id);
mtakahar commented 5 days ago

Reported to pgsql-bugs@lists.postgresql.org: https://www.postgresql.org/message-id/CACfbPhMXe2f81nKgX9PC0q3n4jVkHobXcUj-88h27j1zUAb0Lw%40mail.gmail.com

mtakahar commented 5 days ago

According to the response from Tom Lane, it's more involved that we previously thought:

The trouble with the LEAST/GREATEST formulation is that it may result in different semantics in situations where val1 and val2 aren't the same type. Also, LEAST/GREATEST rely on the default btree opclass for the common type, which might not match the semantics of the comparison operators that the current coding chooses.

There are ways around that --- one could be to transform to LEAST/GREATEST only when the arguments do resolve as the same type. And perhaps you could convince people that BETWEEN ought to depend on the default btree opclass not on operator names. But it's all a lot messier than you might think.