erikgrinaker opened 3 weeks ago
I've run some initial tests using:

```
pg_restore -d tpch --clean --if-exists --no-owner --verbose --table lineitem --section pre-data --section data tpch.pg_dump
```

This imports an 11 GB `lineitem` table, single-threaded. With fsync and S3 disabled, the Neon stack is marginally faster than vanilla Postgres using similar configurations:

That's kind of suspect in and of itself, but given that the disk can handle 5.5 GB/s and Safekeeper+Pageserver can handle 400 MB/s, Postgres appears to be the bottleneck here -- so we need to up the concurrency.
`pg_restore` is basically just `COPY`. We can probably write a synthetic benchmark with similar behavior by generating input data for `COPY FROM STDIN`, combined with `FREEZE` and disabling constraints. This gives us better control over the ingestion data and concurrency.
I'm going to put the `pg_restore` investigation on hold for now, as I'm able to ingest data much faster using a plain `INSERT ... generate_series()`. See #9789.
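A rough sketch of that kind of `generate_series()` ingestion, with an illustrative table, payload width, and row count rather than the exact statement from #9789:

```sh
# Sketch only: bulk-ingest synthetic rows with a single INSERT ... SELECT over generate_series().
psql tpch -c "CREATE TABLE IF NOT EXISTS ingest_test (id bigint, payload text);"
psql tpch -c "INSERT INTO ingest_test SELECT i, repeat('x', 64) FROM generate_series(1, 100000000) AS i;"
```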
We should add an end-to-end benchmark that measures `pg_restore` performance on a non-trivial dataset (e.g. 10 GB with 64 B rows). We should determine the bottlenecks and attempt to improve throughput. We should also try concurrent restores (e.g. 4 or 8 tables).
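Two possible ways to drive the concurrent variant, as a sketch: it assumes the dump is a custom- or directory-format archive (so `--jobs` applies), and the table names are illustrative.

```sh
# Option 1: let pg_restore parallelize across tables itself (custom/directory-format archives only).
pg_restore -d tpch --clean --if-exists --no-owner --jobs=8 --section pre-data --section data tpch.pg_dump

# Option 2: run one single-table data restore per table in the background.
for t in lineitem orders partsupp customer; do
  pg_restore -d tpch --no-owner --table="$t" --section data tpch.pg_dump &
done
wait
```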
There is an existing `pg_restore` benchmark as a GitHub action in `_benchmarking_preparation.yml`; perhaps we can use that as a starting point.