timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/
Other
17.9k stars 882 forks source link

[Bug]: Segmentation fault with time_bucket query #6883

Open pat-git023 opened 6 months ago

pat-git023 commented 6 months ago

What type of bug is this?

Crash

What subsystems and features are affected?

Query executor

What happened?

When I try to query data per month for last year the database always crashes. Statement:

SELECT time_bucket('1 month', time) AS bucket, max(value)-min(value) AS value
FROM openems_energie_erzeugung
where "time">= '2023-01-01' and "time" <= '2023-12-31' 
GROUP BY bucket;

If I change the second time in the where clause to '2023-12-14' the query executes successfully. Change to '2023-12-15' lets the database crash.

TimescaleDB version affected

timescale/timescaledb-ha:pg16 (Manifest digest: sha256:11ddc56063c5324b63ada57c38b6d974e16073ddcf15c95be1bf2e9c67955172)

PostgreSQL version used

postgres (PostgreSQL) 16.2 (Ubuntu 16.2-1.pgdg22.04+1)

What operating system did you use?

Mac OS X Sonoma 14.4.1 ARM (M1 MAX)

What installation method did you use?

Docker

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

2024-05-03 20:17:36 2024-05-03 20:17:36.234 CEST [1] LOG:  server process (PID 49) was terminated by signal 11: Segmentation fault
2024-05-03 20:17:36 2024-05-03 20:17:36.234 CEST [1] DETAIL:  Failed process was running: 
2024-05-03 20:17:36 
2024-05-03 20:17:36     SELECT time_bucket('1 month', time) AS bucket, max(value)-min(value) AS value
2024-05-03 20:17:36     FROM openems_energie_erzeugung
2024-05-03 20:17:36     where "time">= '2023-01-01' and "time" <= '2023-12-31' 
2024-05-03 20:17:36     GROUP BY bucket
2024-05-03 20:17:36 
2024-05-03 20:17:36 2024-05-03 20:17:36.234 CEST [1] LOG:  terminating any other active server processes
2024-05-03 20:17:36 2024-05-03 20:17:36.238 CEST [52] FATAL:  the database system is in recovery mode
2024-05-03 20:17:36 2024-05-03 20:17:36.242 CEST [1] LOG:  all server processes terminated; reinitializing
2024-05-03 20:17:36 2024-05-03 20:17:36.283 CEST [53] LOG:  database system was interrupted; last known up at 2024-05-03 20:16:48 CEST
2024-05-03 20:17:36 2024-05-03 20:17:36.284 CEST [56] FATAL:  the database system is in recovery mode
2024-05-03 20:17:36 2024-05-03 20:17:36.299 CEST [57] FATAL:  the database system is in recovery mode
2024-05-03 20:17:36 2024-05-03 20:17:36.301 CEST [58] FATAL:  the database system is in recovery mode
2024-05-03 20:17:36 2024-05-03 20:17:36.302 CEST [59] FATAL:  the database system is in recovery mode
2024-05-03 20:17:36 2024-05-03 20:17:36.304 CEST [60] FATAL:  the database system is in recovery mode
2024-05-03 20:17:36 2024-05-03 20:17:36.307 CEST [61] FATAL:  the database system is in recovery mode
2024-05-03 20:17:36 2024-05-03 20:17:36.309 CEST [62] FATAL:  the database system is in recovery mode
2024-05-03 20:17:36 2024-05-03 20:17:36.311 CEST [63] FATAL:  the database system is in recovery mode
2024-05-03 20:17:36 2024-05-03 20:17:36.313 CEST [64] FATAL:  the database system is in recovery mode
2024-05-03 20:17:36 2024-05-03 20:17:36.408 CEST [65] FATAL:  the database system is in recovery mode
2024-05-03 20:17:36 2024-05-03 20:17:36.660 CEST [66] FATAL:  the database system is in recovery mode
2024-05-03 20:17:37 2024-05-03 20:17:37.039 CEST [67] FATAL:  the database system is in recovery mode
2024-05-03 20:17:37 2024-05-03 20:17:37.208 CEST [53] LOG:  database system was not properly shut down; automatic recovery in progress
2024-05-03 20:17:37 2024-05-03 20:17:37.217 CEST [53] LOG:  redo starts at 0/9E430558
2024-05-03 20:17:37 2024-05-03 20:17:37.226 CEST [53] LOG:  invalid record length at 0/9E4726F8: expected at least 24, got 0
2024-05-03 20:17:37 2024-05-03 20:17:37.226 CEST [53] LOG:  redo done at 0/9E4726D0 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
2024-05-03 20:17:37 2024-05-03 20:17:37.237 CEST [54] LOG:  checkpoint starting: end-of-recovery immediate wait
2024-05-03 20:17:37 2024-05-03 20:17:37.268 CEST [54] LOG:  checkpoint complete: wrote 67 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.023 s, sync=0.005 s, total=0.031 s; sync files=66, longest=0.001 s, average=0.001 s; distance=264 kB, estimate=264 kB; lsn=0/9E4726F8, redo lsn=0/9E4726F8
2024-05-03 20:17:37 2024-05-03 20:17:37.275 CEST [1] LOG:  database system is ready to accept connections
2024-05-03 20:17:37 2024-05-03 20:17:37.277 CEST [70] LOG:  TimescaleDB background worker launcher connected to shared catalogs

How can we reproduce the bug?

Table is defined as follows:
openhab=# \d+ openems_energie_erzeugung
                                        Table "public.openems_energie_erzeugung"
 Column |           Type           | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+--------------------------+-----------+----------+---------+---------+-------------+--------------+-------------
 time   | timestamp with time zone |           | not null |         | plain   |             |              |
 value  | double precision         |           |          |         | plain   |             |              |
Indexes:
    "openems_energie_erzeugung_pkey" PRIMARY KEY, btree ("time")
    "ix_production_value_time" btree (value, "time" DESC)
Triggers:
    ts_insert_blocker BEFORE INSERT ON openems_energie_erzeugung FOR EACH ROW EXECUTE FUNCTION _timescaledb_functions.insert_blocker()
Child tables: _timescaledb_internal._hyper_1_10_chunk,
              _timescaledb_internal._hyper_1_11_chunk,
              _timescaledb_internal._hyper_1_12_chunk,
              _timescaledb_internal._hyper_1_13_chunk,
              _timescaledb_internal._hyper_1_14_chunk,
              _timescaledb_internal._hyper_1_15_chunk,
              ...

Database table contains about 1.500.000 entry's like follows:
...
2023-12-14 15:10:16.126 +0100   10228.639
2023-12-14 15:11:08.684 +0100   10228.64
2023-12-14 15:12:01.245 +0100   10228.641
2023-12-14 15:28:37.017 +0100   10228.642
2023-12-15 08:32:59.939 +0100   10228.643
2023-12-15 09:01:31.893 +0100   10228.644
2023-12-15 09:07:44.885 +0100   10228.645
2023-12-15 09:12:18.418 +0100   10228.646
...

Hypertable was created like described in the Getting Started documentation:
- SELECT create_hypertable('openems_energie_erzeugung', by_range('time'));
- CREATE INDEX ix_production_value_time ON openems_energie_erzeugung (value, time DESC);
svenklemm commented 6 months ago

Which timescaledb version is that? Could you provide a self-contained producer or a stacktrace. From the provided information it is hard to reproduce the problem.