neondatabase / neon

Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, code-like database branching, and scale to zero.
https://neon.tech
Apache License 2.0

WAL redo processes leak DSM segments, i.e. files in `/dev/shm/` #9738

Open hlinnaka opened 5 days ago

hlinnaka commented 5 days ago

Each WAL redo process creates a tiny DSM (dynamic shared memory) segment, such as `/dev/shm/PostgreSQL.3449905360`. These files are not always (never?) cleaned up on exit, not even on pageserver restart, so a pageserver host can accumulate tens of thousands of them over time. Each file is small, 3-5 kB, but together they add up to a lot of wasted memory.
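To gauge how much has accumulated on a given host, something like the following read-only sketch could help. It assumes the leaked files all follow the naming in the example path above (`PostgreSQL.` followed by decimal digits); the function name `summarize_dsm_segments` is made up for illustration and is not part of any Neon tooling.

```python
import os
import re

# Assumption: leaked segments are named like the example in this issue,
# i.e. "PostgreSQL." followed by decimal digits.
SEGMENT_NAME = re.compile(r"^PostgreSQL\.\d+$")


def summarize_dsm_segments(shm_dir="/dev/shm"):
    """Return (file_count, total_bytes) for files that look like DSM segments.

    This only reads directory entries and file sizes; nothing is modified.
    """
    count = 0
    total_bytes = 0
    with os.scandir(shm_dir) as entries:
        for entry in entries:
            if not SEGMENT_NAME.match(entry.name):
                continue
            try:
                if not entry.is_file(follow_symlinks=False):
                    continue
                size = entry.stat(follow_symlinks=False).st_size
            except FileNotFoundError:
                # The file vanished between listing and stat; skip it.
                continue
            count += 1
            total_bytes += size
    return count, total_bytes
```

Calling `summarize_dsm_segments()` on a pageserver host returns the number of matching files and their combined size in bytes. Note that it cannot tell leaked segments apart from ones still in use by a live process.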

Slack discussion: https://neondb.slack.com/archives/C033RQ5SPDH/p1731486990179549?thread_ts=1731445405.348699&cid=C033RQ5SPDH

A WAL redo process should not need DSM for anything, so let's remove the DSM initialization step altogether. That way the files are never created in the first place.

problame commented 3 hours ago

After this is fixed, rolled out, and won't be rolled back, ping

so we can do the cleanup.
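For the cleanup itself, a rough sketch might look like the one below. It is not an agreed procedure: it assumes the fix is fully rolled out, that no running process on the host still uses a segment, and that leftover files match `PostgreSQL.` followed by digits, as in the example path from the issue description. The helper name `remove_dsm_segments` is hypothetical, and it defaults to a dry run so the list can be reviewed first.

```python
import os
import re

# Assumption: leftover segments are named "PostgreSQL.<digits>",
# matching the example path in the issue description.
SEGMENT_NAME = re.compile(r"^PostgreSQL\.\d+$")


def remove_dsm_segments(shm_dir="/dev/shm", dry_run=True):
    """Return the sorted paths that look like DSM segments.

    The files are deleted only when dry_run is False. The caller is
    responsible for making sure no live process still needs them.
    """
    matched = []
    with os.scandir(shm_dir) as entries:
        for entry in entries:
            if SEGMENT_NAME.match(entry.name) and entry.is_file(follow_symlinks=False):
                matched.append(entry.path)
    if not dry_run:
        for path in matched:
            try:
                os.unlink(path)
            except FileNotFoundError:
                # Already gone, e.g. removed by something else in the meantime.
                pass
    return sorted(matched)
```

Running `remove_dsm_segments()` first and inspecting the returned paths, then repeating with `dry_run=False`, keeps the destructive step explicit.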