reorg / pg_repack

Reorganize tables in PostgreSQL databases with minimal locks
BSD 3-Clause "New" or "Revised" License
1.89k stars 174 forks source link

FREEZE'ing with pg_repack #29

Open nunks opened 10 years ago

nunks commented 10 years ago

Hi!

I've been using pg_repack for some time now as a scheduled vacuum/cluster for big tables in the small hours of a very busy OLTP database. It also happens this database has a big "transaction log" table that takes more than a day to vacuum, blocking transactions and needing to be cancelled. We then need to dump/restore this table periodically to avoid XID wraparound issues, thus stopping all transactions for six hours or so.

If pg_repack rewrites the whole table and its indexes, shouldn't it reduce the transaction ID age of the live tuples it keeps? Is it possible in any circumstances to use it as an online VACUUM FREEZE?

Thanks! Bruno

schmiddy commented 10 years ago

This is an interesting idea, and from skimming: http://www.postgresql.org/message-id/CA+TgmobNU9cfmJD8LODY5K37PGKErGdOpgcs5uee8xtXUR9Abw@mail.gmail.com

It looks like both VACUUM FULL and CLUSTER perform freezing by default now, so in theory we could as well. The devil is in the details: we rely on SQL-level CREATE TABLE AS SELECT... , INSERT, UPDATE, and DELETE commands to build our replacement table, and I doubt there's a mechanism available to us at that level to specify that our tuples should be created with frozenXID.

It would be helpful if the initial CREATE TABLE AS SELECT ... would mark the tuples as frozen by default, since the large majority of the tuples in the new table should hopefully come from that stage. Thought I remembered seeing some -hackers discussion about having CTAS do this by default. One thread here: http://www.postgresql.org/message-id/1354653480.4530.47.camel@sussancws0025

nunks commented 10 years ago

Thanks for answering so fast!

Couldn't it be just something as easy as upgrading catalog data? When you create the new table it will have its own relfrozenxid on pg_class, reflecting the amount of transactions pg_repack used to build it, which probably will be a lot less then the older original table. Maybe, when you swap the old relfilenode for the one attributed to the new table, you only need to swap its age as well.

Sorry if I'm being too simplistic, I don't feel qualified to discuss it in matters of PostgreSQL inner-workings ^_^'...