citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0

Use TRUNCATE + COPY FREEZE when copying a shard #7403

Open JelteF opened 9 months ago

JelteF commented 9 months ago

The file src/backend/distributed/operations/worker_shard_copy.c contains our main COPY logic for shard moves and shard splits. After talking with @DimCitus I realized that we could use the FREEZE option of COPY to reduce the need for heavy vacuuming after the copy is done. To benefit from the FREEZE option, the target table has to be truncated in the same transaction as the copy. This is fine for all our (current) use cases, because the target shard has just been created and is thus empty.

What I think is needed to achieve this:

  1. Add the FREEZE option to the COPY command that we generate in ConstructShardCopyStatement
  2. Add the FREEZE option to the list of options created in LocalCopyToShard
  3. Start a transaction in ConnectToRemoteAndStartCopy
  4. Truncate the table in ConnectToRemoteAndStartCopy (before starting the COPY)
  5. End the transaction in ShardCopyDestReceiverShutdown
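
The command sequence implied by the steps above can be sketched as follows. This is only an illustration and not the existing Citus code: `FreezeCopyStep` and `BuildFreezeCopyCommand` are hypothetical names, the shard name is assumed to be already schema-qualified and quoted by the caller, and the binary format is an assumption about how the COPY is issued.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical steps of a TRUNCATE + COPY FREEZE sequence on the target node. */
typedef enum FreezeCopyStep
{
	STEP_BEGIN,     /* open the transaction (step 3) */
	STEP_TRUNCATE,  /* truncate the empty target shard (step 4) */
	STEP_COPY,      /* start the COPY with FREEZE (steps 1 and 2) */
	STEP_COMMIT     /* end the transaction once all rows are sent (step 5) */
} FreezeCopyStep;

/*
 * Hypothetical helper, not part of Citus: writes the SQL text for one step
 * into buf. shardName is assumed to be qualified and quoted already.
 * Returns the command length, or -1 if it does not fit in bufSize bytes.
 */
static int
BuildFreezeCopyCommand(FreezeCopyStep step, const char *shardName,
					   char *buf, size_t bufSize)
{
	int written = -1;

	switch (step)
	{
		case STEP_BEGIN:
			written = snprintf(buf, bufSize, "BEGIN;");
			break;

		case STEP_TRUNCATE:
			written = snprintf(buf, bufSize, "TRUNCATE %s;", shardName);
			break;

		case STEP_COPY:
			/* FREEZE only takes effect after a TRUNCATE in the same transaction */
			written = snprintf(buf, bufSize,
							   "COPY %s FROM STDIN WITH (FORMAT binary, FREEZE true);",
							   shardName);
			break;

		case STEP_COMMIT:
			written = snprintf(buf, bufSize, "COMMIT;");
			break;
	}

	/* snprintf reports the untruncated length, so detect overflow explicitly */
	if (written < 0 || (size_t) written >= bufSize)
	{
		return -1;
	}

	return written;
}
```

In this sketch the caller would send the BEGIN, TRUNCATE and COPY commands in that order when the connection is set up, stream the rows, and send COMMIT when the copy finishes.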

Apart from the actual implementation, this needs tests to verify that vacuum is indeed not necessary on the new table after a shard move.

marcoslot commented 9 months ago

I wonder whether statistics gathering would be affected. It might be good anyway to run ANALYZE as part of a move to gather statistics.