Open petuhovskiy opened 8 months ago
--enable-offload was enabled in one staging region for a week. It helped uncover some issues (https://neondb.slack.com/archives/C033RQ5SPDH/p1720601531744029); the fix PRs are waiting to be merged. The main issue, though, seems to be resource overload, so it would be good to limit offloading to a lower rate to reduce the load it causes.
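To make the "limit offloading to a lower rate" idea concrete, here is a minimal token-bucket sketch. It is only an illustration, not the safekeeper implementation: `TokenBucket`, `offload_pass`, and the rate/capacity values are hypothetical names and numbers, and the actual offload call is left as a placeholder.

```python
import time


class TokenBucket:
    """Allow roughly `rate` operations per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.last = clock()

    def try_acquire(self, cost=1.0):
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def offload_pass(timelines, bucket):
    """Hypothetical offload loop: process timelines while budget remains.

    Returns (offloaded, deferred); deferred timelines are retried on a later pass.
    """
    offloaded, deferred = [], []
    for timeline in timelines:
        if bucket.try_acquire():
            # Placeholder: the real offload work for `timeline` would happen here.
            offloaded.append(timeline)
        else:
            deferred.append(timeline)
    return offloaded, deferred
```

With something like this, the load caused by offloading is bounded by `rate` and `capacity` rather than by how many timelines become eligible at once; timelines that do not fit in the budget are simply deferred to the next pass.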
Once the fixes are merged, we can deploy --enable-offload to a single prod region and verify it there.
After verification, --delete-offloaded-wal can be rolled out in staging, and then in all prod regions.
I'd say without any rush we can expect this to be rolled out everywhere in ~3 weeks.
Just talked with @arssher; it turns out pull_timeline interferes with eviction and needs more fixes. So my estimate probably shifts by 1+ weeks.
Rough roll-out plan:
- --enable-offload in staging regions, observe for ~week
- --enable-offload in prod regions one by one, observe for ~week
- --delete-offloaded-wal in staging regions, manually trigger uneviction
- --delete-offloaded-wal in prod regions, manually trigger uneviction