timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/
Other
17.84k stars 885 forks source link

drop_chunk and deadlock #2132

Open kalman5 opened 4 years ago

kalman5 commented 4 years ago

Relevant system information:

Describe the bug In some environment with a 24/24 data ingestion in case of short lived tables (like it should contain only last 24 hours) I did some experiments with hypertable sharded in two dimensions: time and external reference. The table had in total 3 external references. What I observe is that each time a drop_chunk is scheduled very often the operation dead locks, this is most likely because of access exclusive lock performed by postgresql on the referenced tables.

To Reproduce The following snippet shows the issue with vanilla postgres

create table test_a (a integer unique);
create table test_b (b integer unique);
create table test2 (a integer references test_a(a), b integer references test_b(b));
Session 1: begin;
Session 2: begin;
           select * from test_b;
Session 1: drop table test2;
Session 2: select * from test_a;

note that if multiple table have the same external refererence parallel drop_chunk deadlock from time to time deadlock as well.

mkindahl commented 4 years ago

@kalman5 Thank you for the bug report.

It is true that deadlocks can occur in a similar way to what you outline with the PG example? It is annoying that deadlocks can occur, but that is true in every database system and not specific to TimescaleDB.

I am not sure if this is issue is a question, a suggestion for improvement, or a bug report. Could you elaborate on improvements you would like or what you think need to change?

kalman5 commented 4 years ago

Imo the chunk_drop should be performed in a different light way, may be dropping the references first (if this doesn't lock the target table) and then dropping the table?

I filled this because I was asked on Slack to do so and yes this is an enhancement

EDIT: I tried the dropping constraint "trick" but it doesn't solve it because it locks the target table as well in "AccessExclusiveLock" mode.

k-rus commented 4 years ago

EDIT: I tried the dropping constraint "trick" but it doesn't solve it because it locks the target table as well in "AccessExclusiveLock" mode.

@kalman5 Thank you for checking! I guess this is what drop table does, i.e., dropping the constraints.

So I guess at least it is necessary to add a note in TimescaleDB documentation.

k-rus commented 4 years ago

Discussion at SO about this issue.

kalman5 commented 4 years ago

As required via Slack, why I'm considering the use of FK important: I think the baseline of a database is consistency and FKs are a big deal in this regard. Having a relational database but not being able to use FK to me is a poor solution, I agree that without foreign keys we have performance advantages (like this one) but I don't like to immolate on the altar of performance consistency. I believe that in situations like the one described the old approach of:

delete from t where time < now() - '1 day'::interval;

and a tuned autovacuum is a solution that just works fine. FKs are so important that PG project spent some effort with PG12 indeed partitions can now be the target for foreign keys.