timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/
Other
16.83k stars 852 forks source link

[Bug]: Deadlock during concurrent operations on hypertable with Foreign Key Reference #5342

Open shhnwz opened 1 year ago

shhnwz commented 1 year ago

What type of bug is this?

Locking issue

What subsystems and features are affected?

Query executor

What happened?

Cyclic lock dependency occurs whenever concurrent operations involves Hypertable [HT] having referential integrity (Foreign Key) defined with plain table [T].

Reported Customer Scenarios:

https://github.com/timescale/timescaledb/issues/4800 https://github.com/timescale/timescaledb/issues/5325

TimescaleDB version affected

2.7.0

PostgreSQL version used

14.1

What operating system did you use?

Ubuntu 20.04 x64

What installation method did you use?

Not applicable

What platform did you run on?

Managed Service for TimescaleDB (MST/Aiven)

Relevant log output and stack trace

No response

How can we reproduce the bug?

-- setup.sql

BEGIN;

-- create a referenced table
CREATE TABLE devices (
    id INT,
    name TEXT,
    PRIMARY KEY (id)
);

-- populate the referenced table
INSERT INTO devices (id, name)
    VALUES (1, 'foo');

-- create a table to be partitioned
CREATE TABLE readings (
    device_id INT,
    time TIMESTAMP WITH TIME ZONE,
    value FLOAT,
    FOREIGN KEY (device_id) REFERENCES devices (id)
);

-- convert it to a hypertable
SELECT
    create_hypertable ('readings', 'time');

-- enable compression
ALTER TABLE readings SET (timescaledb.compress, timescaledb.compress_segmentby
    = 'device_id');

-- create a chunk
INSERT INTO readings (device_id, time, value)
    VALUES (1, '2000-01-01 00:00:00+00', 42.0);

-- compress the chunk
SELECT
    compress_chunk (show_chunks ('readings'));

END;

-- deadlock.sql

-- session 1
BEGIN;

-- acquire AccessShareLock on devices
SELECT
    id
FROM
    devices
WHERE
    name = 'foo';

-- session 2
BEGIN;

-- acquire ShareRowExclusiveLock on devices, wait for AccessExclusiveLock on
-- devices, blocked by session 1
SELECT
    decompress_chunk (show_chunks ('readings'));

-- session 1
-- wait RowExclusiveLock on devices, blocked by session 2, deadlock
UPDATE
    devices
SET
    name = 'bar'
WHERE
    id = 1;
joeberetta commented 6 months ago

Hi there we've got similar issue on PostgreSQL version 15.5 TimescaleDB version v2.13.0 (latest)

LOG: process 1729351 still waiting for ShareLock on relation 18907 of database 16466 after 9997.774 ms

Any updates here?

semirke commented 3 months ago

Same here, waiting for updates. Thanks!

jdepreter commented 1 month ago

We are also experiencing this issue! Also waiting for updates.

antekresic commented 1 month ago

Hi there,

There have been recent changes with the way we handle foreign key constraints recently.

Would you mind confirming that this still an issue with the latest version 2.15? For the record, I tried the repro case locally and it seemed to not cause a deadlock with version 2.15 on PG 16

Thanks!

rnovatorov commented 1 month ago

Hi @antekresic,

Thank you for your notice.

After having tried to reproduce the issue locally with the latest version 2.15 on PG 15.7, I can gladly confirm that the instruction I originally came up with no longer causes the deadlock as decompress_chunk does not acquire ShareRowExclusiveLock on devices anymore.

As for production environment in which the issue initially occurred, unfortunately I cannot check if it's gone because we ended up dropping the constraint.

I would personally consider this one resolved.

jdepreter commented 3 days ago

Can confirm that our issue is also fixed on version 2.15.2 on PG 14.