ozgune opened 7 years ago
During table creation, Postgres also creates 2 types for the relation: a regular type (e.g. `events`) and an array type for arrays (`events[]`). The actual name of the array type in `pg_type` is `_events`, in other words: the relation name prefixed with an underscore. When adding an underscore to a name that's already 64 characters long, the last character of the name is removed.
This means creating tables with 64-character names could lead to type name collisions in the array type when creating 2 tables which have the first 63 characters in common. Postgres resolves the collision by adding additional underscores to the second type name and truncating it further. Before it can generate the new array type name, it acquires a lock on the existing array type in `pg_type`, and if that hasn't been committed yet, `CREATE TABLE` will block on this lock.
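The prefix-and-truncate behavior described above is enough to produce the collision. Here is a minimal sketch in Python (an illustration only: the function `array_type_name` is a simplified stand-in for PostgreSQL's actual array type naming, not its real algorithm):

```python
NAMEDATALEN = 64  # PostgreSQL stores names in a fixed 64-byte field (63 chars + NUL)

def array_type_name(relname: str, max_len: int = NAMEDATALEN - 1) -> str:
    """Simplified sketch of array type naming (an assumption, not PostgreSQL's
    exact algorithm): prepend an underscore, then truncate to the identifier
    limit, dropping trailing characters."""
    return ("_" + relname)[:max_len]

# Two maximum-length table names that differ only in their last character:
t1 = "events_" + "x" * 56        # 63 characters
t2 = "events_" + "x" * 55 + "y"  # 63 characters
# Prefixing the underscore pushes both names past the limit, the differing
# last character is truncated away, and both tables derive the SAME array
# type name -- the collision that makes the second CREATE TABLE block.
```

With `t1` and `t2` above, `array_type_name(t1)` and `array_type_name(t2)` are identical, which is why the second `CREATE TABLE` has to wait on the first transaction's lock.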
When running `create_distributed_table` in a transaction block, we use a separate connection per shard placement in order to allow a `COPY` or DDL command in the same transaction. However, with 64-character table names, this creates a deadlock between the different connections that are trying to create shard placements which have the first 63 characters in common (unless there are 10 or more worker nodes). The second `CREATE TABLE` causes an array type name collision and blocks, which causes the deadlock detector to detect a self-deadlock (it is not clear why we sometimes get empty error messages).
For long table names we use a shortening mechanism that truncates the table name and adds a hash of the full name before appending the shard ID. That way, we get shard names that are exactly 64 characters even when appending a shard ID to the full name would exceed 64 characters. I think we can lower the threshold such that we generate shard names that are at most 63 characters.
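The shortening scheme can be sketched as follows. This is a minimal illustration, not Citus's actual implementation: the function name `shard_name`, the choice of MD5, and the 8-character digest length are assumptions made for the example, and it already applies the proposed 63-character cap:

```python
import hashlib

NAMEDATALEN = 64                      # PostgreSQL's identifier field size
MAX_IDENTIFIER_LEN = NAMEDATALEN - 1  # 63 usable characters

def shard_name(table_name: str, shard_id: int,
               max_len: int = MAX_IDENTIFIER_LEN) -> str:
    """Illustrative sketch of the shortening scheme described above (not
    Citus's real code): if the plain name fits, use it; otherwise truncate
    the table name, splice in a hash of the full name, append the shard ID."""
    candidate = f"{table_name}_{shard_id}"
    if len(candidate) <= max_len:
        return candidate
    # A few hex chars of a hash disambiguate tables whose long names share
    # a common prefix that would otherwise survive truncation unchanged.
    digest = hashlib.md5(table_name.encode()).hexdigest()[:8]
    suffix = f"_{digest}_{shard_id}"
    return table_name[: max_len - len(suffix)] + suffix

long_a = "a" * 70
long_b = "a" * 69 + "b"
# Both shard names fit within 63 characters, and the hash keeps them
# distinct even though the truncated prefixes are identical.
```

The hash is what prevents the collision described earlier: two long names with the same first 63 characters truncate to the same prefix, but their full-name hashes differ, so the resulting shard names differ.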
@marcocitus: your explanation here doesn't address what changed in 7.0… What's the transactional/connection-model change that causes this?
Trying to reason through possible approaches to this; it's a tough one.
+1
Caused a few hours of troubleshooting. I also noticed that the WAL filled my disk when running `\copy` commands with table names longer than 63 characters (after Citus appended shard IDs).
Partitioned tables tend to have longer names, and this becomes an issue there as well:
```sql
-- to force more than 1 connection per node
SET citus.force_max_query_parallelization TO ON;

CREATE TABLE test_index_creation1
(
    tenant_id integer NOT NULL,
    timeperiod timestamp without time zone NOT NULL,
    field1 integer NOT NULL,
    inserted_utc timestamp without time zone NOT NULL DEFAULT now()
) PARTITION BY RANGE (timeperiod);

SELECT create_distributed_table('test_index_creation1', 'tenant_id');

CREATE TABLE test_index_creation1_p2020_09_28_long_name_long_name_long
    PARTITION OF test_index_creation1
    FOR VALUES FROM ('2020-09-28 00:00:00') TO ('2020-09-29 00:00:00');
```

```
ERROR:  canceling the transaction since it was involved in a distributed deadlock
```
If we build a similar solution for `CREATE TABLE .. PARTITION OF` and `ATTACH PARTITION` commands, we can fix this issue as well.
The following code block works as expected in Citus 6.2-4:
In Citus 7.0-15, the same code block results in a transaction that hangs. The distributed deadlock detection code ultimately aborts the transaction. I see the following error on the coordinator node:
PostgreSQL truncates table names that exceed 63 characters (`NAMEDATALEN - 1`). Citus does the same for shard names. However, we didn't change any logic in Citus 7 related to name truncation (or to adding a hash into the shard name).
The issue here could be related to how Citus manages transactions and connections. The error message above is a symptom; we should look for its root cause.