walredo spawn latency is bimodal on most pageservers: some spawns are fast, taking tens of milliseconds; others are slow, taking multiple seconds
even though the Rust stdlib uses the efficient posix_spawn by default, we don't use it on pageservers because we use pre_exec() in close_fds(), and a registered pre_exec() hook makes the stdlib fall back to the slower fork() + exec() path
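To make the pre_exec() effect concrete, here is a minimal sketch that times `Command::spawn()` with and without a hook. It assumes a Unix host with a `true` binary on PATH; the helper name `time_spawn` is hypothetical, and the no-op hook only stands in for the real close_fds() logic. A single measurement is noisy, so treat this as a way to reproduce the effect, not as a benchmark.

```rust
use std::io;
use std::os::unix::process::CommandExt;
use std::process::{Command, ExitStatus};
use std::time::{Duration, Instant};

/// Spawn `true`, returning how long `spawn()` itself took and the exit status.
/// `time_spawn` is a hypothetical helper, not part of the pageserver code.
fn time_spawn(with_pre_exec: bool) -> io::Result<(Duration, ExitStatus)> {
    let mut cmd = Command::new("true");
    if with_pre_exec {
        // Registering any pre_exec hook prevents std from using posix_spawn,
        // so this spawn goes through fork() + exec().
        // SAFETY: the closure does nothing, so it is async-signal-safe.
        unsafe {
            cmd.pre_exec(|| Ok(()));
        }
    }
    let start = Instant::now();
    let mut child = cmd.spawn()?;
    let elapsed = start.elapsed();
    let status = child.wait()?;
    Ok((elapsed, status))
}

fn main() -> io::Result<()> {
    for with_pre_exec in [false, true] {
        let (elapsed, status) = time_spawn(with_pre_exec)?;
        println!("pre_exec={with_pre_exec} spawn took {elapsed:?}, status={status}");
    }
    Ok(())
}
```

The gap between the two paths is expected to grow with the size of the parent's address space, so a small test binary like this will likely understate what a pageserver sees.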
DoD
walredo process spawning latency is predictable
acquisition of a walredo process for page reconstruction is < XXX milliseconds
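Because the latency distribution is bimodal, a mean or median can look healthy while the slow mode is still present. A minimal sketch of checking the DoD against a tail percentile follows; the `percentile` helper is hypothetical and the sample values are illustrative only, loosely modeled on the "tens of milliseconds" and "multiple seconds" modes described above.

```rust
use std::time::Duration;

/// Nearest-rank percentile over latency samples; `p` is in 0.0..=100.0.
/// Sorts the slice in place and returns None for an empty input.
fn percentile(samples: &mut [Duration], p: f64) -> Option<Duration> {
    if samples.is_empty() {
        return None;
    }
    samples.sort();
    let rank = ((p / 100.0) * samples.len() as f64).ceil() as usize;
    // Clamp so that p = 0 maps to the first sample and p = 100 to the last.
    Some(samples[rank.clamp(1, samples.len()) - 1])
}

fn main() {
    // Illustrative samples only: nine fast spawns and one slow one.
    let mut samples: Vec<Duration> = vec![Duration::from_millis(20); 9];
    samples.push(Duration::from_millis(2000));

    let p50 = percentile(&mut samples, 50.0).unwrap();
    let p99 = percentile(&mut samples, 99.0).unwrap();
    println!("p50={p50:?} p99={p99:?}");
}
```

With samples like these the median sits in the fast mode while the p99 sits in the slow one, which is why the acquisition budget is better stated against a high percentile once the target value is chosen.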
Plan
Explore whether we can use posix_spawn; if so, ship to staging and observe whether it is a sufficient improvement. We can move the close_fds work into walredo startup, where we still trust the process.
If posix_spawn can't be used, implement a sidecar "spawner" process that pageserver asks to spawn walredo processes.
Option 1: extend the existing walredo C code to enter "template" mode.
Option 2: fork off a pageserver child process that will act as the spawner process.
NB: we decided against a pool of pre-spawned walredo processes, as the amount of CPU wasted on the inefficient fork() call is significant.
### Solve The Issue
- [ ] https://github.com/neondatabase/neon/pull/6573
- [ ] https://github.com/neondatabase/neon/pull/6574
- [ ] https://github.com/neondatabase/neon/issues/6630
- [x] measure impact in staging & prod => merge above preliminary work to get better observability
- [x] it's good, we wrote a blog post about it
However, spawn still takes tens of milliseconds, so the first getpage request in a while from a running database sees a latency spike, hence the motivation to use a pool.
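The pool idea can be sketched as a single-slot cache of a pre-spawned process, so the request path takes a ready process instead of paying the spawn cost. This is only an illustration under stated assumptions: `SpawnPool` and its methods are hypothetical names, `cat` stands in for the walredo binary, and a real pool would need per-tenant handling and background refill.

```rust
use std::io;
use std::process::{Child, Command, Stdio};

/// Hypothetical one-slot pool that keeps a single pre-spawned process ready.
struct SpawnPool {
    program: String,
    ready: Option<Child>,
}

impl SpawnPool {
    fn new(program: &str) -> io::Result<Self> {
        let mut pool = SpawnPool { program: program.to_string(), ready: None };
        pool.refill()?;
        Ok(pool)
    }

    fn spawn_one(&self) -> io::Result<Child> {
        Command::new(&self.program)
            .stdin(Stdio::piped())
            .stdout(Stdio::piped())
            .spawn()
    }

    /// Pre-spawn a process if the slot is empty (off the request path).
    fn refill(&mut self) -> io::Result<()> {
        if self.ready.is_none() {
            self.ready = Some(self.spawn_one()?);
        }
        Ok(())
    }

    /// Hand out the ready process, or pay the spawn cost if the slot is empty.
    fn take(&mut self) -> io::Result<Child> {
        match self.ready.take() {
            Some(child) => Ok(child),
            None => self.spawn_one(),
        }
    }
}

impl Drop for SpawnPool {
    fn drop(&mut self) {
        // Reap the unused pre-spawned process so it does not linger.
        if let Some(mut child) = self.ready.take() {
            let _ = child.kill();
            let _ = child.wait();
        }
    }
}

fn main() -> io::Result<()> {
    let mut pool = SpawnPool::new("cat")?; // `cat` is a stand-in process
    let mut child = pool.take()?; // no spawn on this path
    pool.refill()?; // replenish for the next request
    drop(child.stdin.take()); // closing stdin lets `cat` exit
    println!("child exited with {}", child.wait()?);
    Ok(())
}
```

The trade-off noted in the plan still applies: every refill is a spawn whose process may never be used, so the pool trades CPU for first-request latency.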
Problem
On some pageservers we see >1s times to spawn the process.
Investigation Results
Customer investigation https://neondb.slack.com/archives/C033RQ5SPDH/p1706787518630459?thread_ts=1706774416.482029&cid=C033RQ5SPDH
Background Reading
Work
Spin-Offs (no need to complete before closing)