neondatabase / neon

Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, code-like database branching, and scale to zero.
https://neon.tech
Apache License 2.0

safekeeper: increase segment size #9687

Open erikgrinaker opened 2 hours ago

erikgrinaker commented 2 hours ago

Fsync costs when closing and initializing segments significantly affect WAL ingestion throughput. Increasing the segment size would amortize these costs.
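To illustrate the amortization argument, here is a minimal sketch of a cost model in which every segment pays a fixed overhead (standing in for the fsyncs on close and initialization) on top of the time spent writing its bytes. The function name `effective_throughput` and the inputs `raw_mib_per_s` and `per_segment_overhead_s` are hypothetical, and the numbers in `main` are made-up placeholders, not measurements from the safekeeper.

```rust
/// Effective throughput in MiB/s when each segment incurs a fixed overhead.
/// Both `raw_mib_per_s` and `per_segment_overhead_s` are assumed inputs.
fn effective_throughput(segment_mib: f64, raw_mib_per_s: f64, per_segment_overhead_s: f64) -> f64 {
    // Time spent actually writing one segment's worth of WAL.
    let write_time_s = segment_mib / raw_mib_per_s;
    // The fixed per-segment cost is paid once per segment, so a larger
    // segment spreads it over more bytes.
    segment_mib / (write_time_s + per_segment_overhead_s)
}

fn main() {
    let raw_mib_per_s = 2048.0; // assumed raw write bandwidth
    let overhead_s = 0.02; // assumed fixed cost per segment
    for segment_mib in [16.0, 128.0] {
        let thrpt = effective_throughput(segment_mib, raw_mib_per_s, overhead_s);
        println!("{segment_mib} MiB segments: {thrpt:.0} MiB/s");
    }
}
```

Under this model the fixed cost dominates for small segments and shrinks toward zero as the segment size grows, so throughput approaches the raw write bandwidth.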

Local experiments on my MacBook show that increasing the segment size from 16 MB to 128 MB yields a roughly 200% improvement in throughput (about 3x) for large appends of 128 KiB and above.

erikgrinaker commented 2 hours ago

Benchmarks with 128 MB segments on a MacBook (compared to 16 MB segments):

wal_acceptor_throughput/fsync=true/commit=false/size=1024
                        time:   [12.519 s 12.546 s 12.572 s]
                        thrpt:  [81.448 MiB/s 81.618 MiB/s 81.796 MiB/s]
                 change:
                        time:   [-8.5756% -7.9787% -7.5078%] (p = 0.00 < 0.05)
                        thrpt:  [+8.1173% +8.6705% +9.3800%]
wal_acceptor_throughput/fsync=true/commit=false/size=8192
                        time:   [2.0104 s 2.0257 s 2.0420 s]
                        thrpt:  [501.48 MiB/s 505.50 MiB/s 509.34 MiB/s]
                 change:
                        time:   [-37.287% -36.477% -35.645%] (p = 0.00 < 0.05)
                        thrpt:  [+55.388% +57.423% +59.456%]
wal_acceptor_throughput/fsync=true/commit=false/size=131072
                        time:   [580.04 ms 592.50 ms 606.46 ms]
                        thrpt:  [1.6489 GiB/s 1.6878 GiB/s 1.7240 GiB/s]
                 change:
                        time:   [-68.217% -67.331% -66.340%] (p = 0.00 < 0.05)
                        thrpt:  [+197.08% +206.10% +214.63%]
wal_acceptor_throughput/fsync=true/commit=false/size=1048576
                        time:   [636.88 ms 651.40 ms 668.31 ms]
                        thrpt:  [1.4963 GiB/s 1.5352 GiB/s 1.5702 GiB/s]
                 change:
                        time:   [-69.122% -68.119% -66.988%] (p = 0.00 < 0.05)
                        thrpt:  [+202.92% +213.66% +223.86%]
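The time and throughput changes above are two views of the same measurement: for a fixed amount of data, throughput is inversely proportional to elapsed time. As a small sketch (the helper name `throughput_change_pct` is hypothetical), the conversion can be checked against the point estimates reported above:

```rust
/// Converts a relative change in elapsed time (in percent) into the
/// corresponding relative change in throughput (in percent), assuming the
/// amount of data processed is unchanged.
fn throughput_change_pct(time_change_pct: f64) -> f64 {
    // New time as a fraction of the old time, e.g. -67.331% -> 0.32669.
    let time_ratio = 1.0 + time_change_pct / 100.0;
    // Throughput scales with the inverse of the time ratio.
    (1.0 / time_ratio - 1.0) * 100.0
}

fn main() {
    // Point estimates of the time change from the benchmark output above,
    // paired with the append size in bytes.
    let results = [
        (1024, -7.9787),
        (8192, -36.477),
        (131072, -67.331),
        (1048576, -68.119),
    ];
    for (size, time_change) in results {
        let thrpt_change = throughput_change_pct(time_change);
        println!("size={size}: time {time_change:+.3}% -> thrpt {thrpt_change:+.2}%");
    }
}
```

A time reduction of about two thirds therefore corresponds to roughly triple the throughput, which is where the approximately 200% improvement for the 128 KiB and 1 MiB appends comes from.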