openstreetmap / operations

OSMF Operations Working Group issue tracking
https://operations.osmfoundation.org/

Make swapping the master DB easier #120

Open zerebubuth opened 7 years ago

zerebubuth commented 7 years ago

The question of auto-failover has come up before (#11), but this issue is about the non-automatic version of that.

At the moment, flipping the master between sites is so onerous that we put off doing it whenever possible (see #119).

What simple things can we do to make this an easier, less scary process (although still manual)? Ideally, we'd be able to do this without downtime. Is there a way to achieve that, e.g. by having pgbouncer flip connections on the fly?

tomhughes commented 7 years ago

It's such a high-risk operation that I'm not sure I'd really want to try to automate it - if anything goes wrong you can be stuck having to reload from backup.

zerebubuth commented 7 years ago

Let's try and figure out if there's a way of making it less risky.

To be clear, I'm not suggesting that we'd want to do this on a regular basis. But I'm worried that it's currently such an onerous and risky task that we'd delay moving to karm for two months to avoid flipping twice. This makes me think that it would be too difficult and risky to attempt if we were suffering from issues at the master database's site.

tomhughes commented 7 years ago

Well, it's more like one month really - it's not like we were about to move to karm tomorrow.

tomhughes commented 7 years ago

For the record, this is the process I followed last time we did it to switch the master from ramoth to katla:

So it's not actually that hard, just very nerve-wracking...