dataegret / pgcompacttable


Huge deadlock count in the last phase: ALTER INDEX ... RENAME TO operations #35

Open aulaev opened 3 years ago

aulaev commented 3 years ago

Hello, Maxim!

Thank you for a very useful tool. We are seeing deadlock errors in the PostgreSQL logs related to the last phase of pgcompacttable, the "REINDEX" step. For example:

2020-10-31 21:05:37 MSK [21303] 172.20.2.144 PostgreSQL JDBC Driver queue2@queue2 40P01 UPDATE ERROR: deadlock detected
2020-10-31 21:05:37 MSK [21303] 172.20.2.144 PostgreSQL JDBC Driver queue2@queue2 40P01 UPDATE DETAIL: Process 21303 waits for RowExclusiveLock on relation 38747785 of database 16619; blocked by process 20639.
Process 20639 waits for AccessExclusiveLock on relation 39109132 of database 16619; blocked by process 21303.
Process 21303: UPDATE public.queue_message2 as x SET sended_at = now(), state='SENDED' WHERE id = ( SELECT id FROM public.queue_message2 WHERE queue_id = $1 AND state = 'QUEUED' AND router_tag = $2 AND expired_at > now() ORDER BY queue_id, state, router_tag, priority DESC, queued_at ASC FOR UPDATE SKIP LOCKED LIMIT 1) RETURNING x.*
Process 20639: ALTER INDEX "public".pgcompact_index_20528 RENAME TO "pgcompact_index_14090";

Could any of the "Options controlling the behaviour", such as the --reindex-* settings, have a positive effect here? Maybe --print-reindex-queries would help in such a situation, so that we could perform the reindex manually later?

The problem arises only on heavily DML-loaded database shards.

aulaev commented 3 years ago

Also:

2020-10-31 21:05:28 MSK [20639] [local] [unknown] postgres@queue2 57014 ALTER INDEX ERROR: canceling statement due to statement timeout
2020-10-31 21:05:28 MSK [20639] [local] [unknown] postgres@queue2 57014 ALTER INDEX STATEMENT: ALTER INDEX "public".pgcompact_index_20528 RENAME TO "idx_queue_message2_main";

aulaev commented 3 years ago

I've set up hourly execution via crontab:

/u01/postgres/pgcompacttable-master/bin/pgcompacttable --dbname queue2 --verbose info --reindex-retry-count 5 --reindex-retry-pause 10 --reindex-lock-timeout 5000 -f

and am still getting errors:

[postgres@ ~]$ grep -c ERR /u01/postgres/main/log/postgresql-2020-11-01.log
26

[local]:5432 postgres@queue2=# \d+ queue_message2
Table "public.queue_message2"
...
Indexes:
    "pk_queue_message1" PRIMARY KEY, btree (id)
    "pgcompact_index_28802" UNIQUE, btree (id)
    "idx_queue_message1_msg_id" btree (message_id)
    "idx_queue_message2_cnt" btree (expired_at, queue_id) WHERE state = 'QUEUED'::text
    "idx_queue_message2_content" btree (queue_id, state, router_tag, content_type, priority DESC, queued_at, expired_at, id)
    "idx_queue_message2_expired" btree (state, expired_at, sended_at, id)
    "idx_queue_message2_main" btree (queue_id, state, router_tag, priority DESC, queued_at, expired_at, id)
    "idx_queue_message2_message_id" btree (queue_id, message_id) WHERE state = 'SENDED'::text
    "idx_queue_message2_mt" btree (queue_id, state, router_tag, message_type, priority DESC, queued_at, expired_at, id)
    "pgcompact_index_14090" btree (queue_id, state, router_tag, priority DESC, queued_at, expired_at, id)
    "pgcompact_index_4300" btree (queue_id, state, router_tag, priority DESC, queued_at, expired_at, id)
    "pgcompact_index_4803" btree (queue_id, state, router_tag, priority DESC, queued_at, expired_at, id)
    "queue_message2_qualified_content_name_idx" btree (qualified_content_name)

It seems strange that the pgcompact_index_* indexes are not removed, and that all of them plus their "master" index (idx_queue_message2_main) lead to deadlock errors on ALTER INDEX ... RENAME TO operations.

[local]:5432 postgres@queue2=# select * from pg_stat_user_indexes where indexrelname='idx_queue_message2_main';
-[ RECORD 1 ]-+------------------------
relid         | 16632
indexrelid    | 39485407
schemaname    | public
relname       | queue_message2
indexrelname  | idx_queue_message2_main
idx_scan      | 1652
idx_tup_read  | 3945
idx_tup_fetch | 78

aulaev commented 3 years ago

[local]:5432 postgres@queue2=# select * from pg_stat_user_indexes where relname='queue_message2';
 relid | indexrelid | schemaname | relname | indexrelname | idx_scan | idx_tup_read | idx_tup_fetch
-------+------------+------------+---------+--------------+----------+--------------+---------------
 16632 | 39485382 | public | queue_message2 | idx_queue_message2_message_id | 25 | 626576 | 0
 16632 | 39485389 | public | queue_message2 | idx_queue_message2_cnt | 56 | 3689 | 0
 16632 | 39485402 | public | queue_message2 | idx_queue_message1_msg_id | 683526 | 591117 | 377035
 16632 | 39485403 | public | queue_message2 | pk_queue_message1 | 408 | 10 | 10
 16632 | 39485405 | public | queue_message2 | pgcompact_index_28802 | 4752409 | 305091 | 272196
 16632 | 39485406 | public | queue_message2 | queue_message2_qualified_content_name_idx | 0 | 0 | 0
 16632 | 39485407 | public | queue_message2 | idx_queue_message2_main | 1652 | 3945 | 78
 16632 | 39485408 | public | queue_message2 | pgcompact_index_14090 | 1251 | 771 | 63
 16632 | 39485409 | public | queue_message2 | pgcompact_index_4300 | 1903 | 2173 | 93
 16632 | 39485410 | public | queue_message2 | pgcompact_index_4803 | 3989263 | 3574874 | 318559
 16632 | 39485423 | public | queue_message2 | idx_queue_message2_mt | 9799 | 5 | 2
 16632 | 39485436 | public | queue_message2 | idx_queue_message2_expired | 594 | 10766611 | 17919
 16632 | 39485437 | public | queue_message2 | idx_queue_message2_content | 724183 | 209192 | 77533
(13 rows)

aulaev commented 3 years ago

[postgres@ ~]$ /u01/postgres/pgcompacttable-master/bin/pgcompacttable -V
/u01/postgres/pgcompacttable-master/bin/pgcompacttable v1.0.10

We're using the latest version of the tool.

aulaev commented 3 years ago

One more type of error: canceling autovacuum task (it arises only while pgcompacttable is running).

PG log:
2020-11-02 17:30:28 MSK [6018] ERROR: canceling autovacuum task
2020-11-02 17:30:28 MSK [6018] CONTEXT: automatic vacuum of table "queue2.public.queue_message2"

tool log:
[Mon Nov 2 17:30:03 2020] (queue2:public.queue_message2) Set pages/round: 5.
[Mon Nov 2 17:30:03 2020] (queue2:public.queue_message2) Set pages/vacuum: 626.
[Mon Nov 2 17:30:28 2020] (queue2:public.queue_message2) Vacuum final: cannot clean 32 pages, 7488 pages left, duration 2.411 seconds.
[Mon Nov 2 17:30:29 2020] (queue2:public.queue_message2) Analyze final: duration 0.297 second.
[Mon Nov 2 17:30:29 2020] (queue2:public.queue_message2) Bloat statistics with pgstattuple: duration 0.048 seconds.

Melkij commented 3 years ago

Hello. For some applications it is important not to change index names. To achieve this, we first build a new index (named pgcompact_index_N) and then swap the new and old indexes under the index rename lock (with the maximum lock duration set by --reindex-lock-timeout, which is exactly the reason for the "canceling statement due to statement timeout" errors in the PG log). When we can't rename the index, we sleep for --reindex-retry-pause seconds and try the rename again, up to --reindex-retry-count retries. --print-reindex-queries prints the SQL the tool uses for the reindex.
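A rough sketch of that swap, reusing the index names that appear in this thread and assuming a 5-second timeout; the tool's actual statements are whatever --print-reindex-queries shows:

```sql
-- Sketch only, not pgcompacttable's exact SQL (see --print-reindex-queries).
-- 1. Build a fresh copy of the bloated index without blocking writes.
CREATE INDEX CONCURRENTLY pgcompact_index_20528
    ON public.queue_message2 (queue_id, state, router_tag, priority DESC, queued_at, expired_at, id);

-- 2. Swap the names under a short exclusive lock; the timeout makes a busy
--    table cancel the ALTER instead of blocking other sessions for long.
SET statement_timeout = '5s';
BEGIN;
ALTER INDEX public.idx_queue_message2_main RENAME TO pgcompact_index_14090; -- old index gets a placeholder name
ALTER INDEX public.pgcompact_index_20528 RENAME TO idx_queue_message2_main; -- new index takes the original name
COMMIT;

-- 3. Drop the old, now renamed, index (must run outside a transaction).
DROP INDEX CONCURRENTLY public.pgcompact_index_14090;
```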

I didn't actually expect a deadlock here. I think that in this case there is no further error handling and pgcompacttable simply leaves the new index behind. I suggest checking all pgcompact_index_N indexes; most likely they are redundant and should be dropped.
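For example, a quick way to spot the leftovers (a sketch; verify against the \d+ output above that each one really duplicates an existing index before dropping it):

```sql
-- List indexes whose names match pgcompacttable's temporary naming pattern.
SELECT schemaname, tablename, indexname, indexdef
FROM pg_indexes
WHERE indexname LIKE 'pgcompact_index_%';

-- Drop a confirmed-redundant leftover without blocking writes.
DROP INDEX CONCURRENTLY public.pgcompact_index_14090;
```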

> I've set up hourly execution via crontab

I doubt this is a good idea at all. For regular maintenance it would be better to tune the autovacuum settings and investigate the root cause of your table (or index) bloat. The usual reasons are long transactions and the default (that is, lazy) autovacuum settings. With appropriately tuned autovacuum, PostgreSQL typically does not need pgcompacttable at all (except after massive deletes).
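As an illustration, per-table autovacuum settings can be tightened on the hot table; the values below are assumptions for the sketch, not a recommendation for this specific workload:

```sql
-- Make autovacuum trigger earlier and throttle itself less on the busy queue table.
ALTER TABLE public.queue_message2 SET (
    autovacuum_vacuum_scale_factor  = 0.01,  -- vacuum after ~1% of rows are dead
    autovacuum_analyze_scale_factor = 0.02,  -- analyze more often as well
    autovacuum_vacuum_cost_delay    = 2      -- milliseconds; less throttling
);
```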

> One more type of error: canceling autovacuum task

Yes, this is expected behavior. pgcompacttable runs some commands (e.g. VACUUM, ANALYZE) which definitely conflict with autovacuum; autovacuum notices this and cancels itself with such a log message.

If you are on a modern PostgreSQL (12 and above), it would be better to use the native REINDEX CONCURRENTLY command instead of the manual index rename. It handles indexes the same way pgcompacttable does, but has no concurrent locking issues during the index swap. I am one of the people who worked on this feature in the PostgreSQL core, but unfortunately I still have not had time to add support for it in pgcompacttable.
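For example, on PostgreSQL 12+ the whole swap sketched above collapses into a single statement:

```sql
-- Rebuild one bloated index in place; no temporary name or manual rename needed.
REINDEX INDEX CONCURRENTLY public.idx_queue_message2_main;

-- Or rebuild every index of the table at once.
REINDEX TABLE CONCURRENTLY public.queue_message2;
```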