reorg / pg_repack

Reorganize tables in PostgreSQL databases with minimal locks
BSD 3-Clause "New" or "Revised" License
1.89k stars 175 forks source link

ERROR in repack.repack_swap() if CIC creates a new index on the original table after rebuild_indexes() on new table #420

Open JohnHien opened 2 months ago

JohnHien commented 2 months ago

In repack_one_table(), the rebuild_indexes() function is called to rebuild the indexes on the new table and the CREATE INDEX commands are from the index definition on the original table. What if a new index is created on the original table after the rebuild_indexes() function called? I think this new index on the original table will not be created in the new table.

To prove it, I modified the code in repack_one_table() to sleep for 20s after calling rebuild_indexes(). When the pg_repack client was sleeping, I attached to it with gdb, so the client paused and I had time to create the new index. I executed the CIC command and a new index test_a_idx was created on the original table. This is possible because only an ACCESS SHARE lock is held on the table now.

postgres=# create index concurrently on test(a);
CREATE INDEX
postgres=# \d+ test
                                          Table "public.test"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | integer |           | not null |         | plain   |             |              |
Indexes:
    "test_pkey" PRIMARY KEY, btree (a)
    "test_a_idx" btree (a)
Triggers firing always:
    repack_trigger AFTER INSERT OR DELETE OR UPDATE ON test FOR EACH ROW EXECUTE FUNCTION repack.repack_trigger('a')
Access method: heap

Then I executed 'c' to continue in gdb and I got the error in repack_swap():

ERROR: query failed: ERROR:  relation "repack.index_16577" does not exist
CONTEXT:  SQL statement "SELECT X.oid, Y.oid  FROM pg_catalog.pg_index I,       pg_catalog.pg_class X,       pg_catalog.pg_class Y WHERE I.indrelid = $1   AND I.indexrelid = X.oid   AND I.indisvalid   AND Y.oid = ('repack.index_' || X.oid)::regclass"
DETAIL: query was: SELECT repack.repack_swap($1)

repackswap() assumes that all the indexes on the original table are already created on the new table with name index, but in this case the new index test_a_idx with oid 16577 was not created on the new table because it was created after rebuild_indexes().

postgres=# select relname from pg_class where oid = 16577;
  relname
------------
 test_a_idx
(1 row)

I know this is a corner case and I am not sure whether we should fix it.

za-arthur commented 2 months ago

I think this would require refactoring of repack_one_table(). And this issue looks related to the issue https://github.com/reorg/pg_repack/issues/421.

New algorithm can look like this: 1 - get list of tables 2 - call repack_one_table()

Now the issue is that ACCESS SHARE lock (as currently) won't prevent from creating new indexes. CREATE INDEX acquires SHARE lock and CREATE INDEX CONCURRENTLY acquires SHARE UPDATE EXCLUSIVE. If pg_repack will acquire SHARE lock then it will also lock all other writes to the table (like INSERT, UPDATE), which is not acceptible If pg_repack will acquire SHARE UPDATE EXCLUSIVE lock then it seems it will prevent from both CREATE INDEX and CIC (according to the documentation):

Conflicts with the SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE lock modes. This mode protects a table against concurrent schema changes and VACUUM runs.

So SHARE UPDATE EXCLUSIVE lock might be suitable lock. But it will also lock VACUUM and ANALYZE on the table.

Thoughts?

andreasscherbaum commented 2 months ago

SHARE UPDATE EXCLUSIVE is a rather big change. On first glance I don't see that it breaks other DML operations, but it certainly can break other workflows.

If we implement this, this deserves a version number change, and a longer beta test where we hear back from users if it works for them.