t-matsuo / resource-agents

pgsql RA (OCF resource agent) for Pacemaker and PostgreSQL streaming replication. See https://github.com/t-matsuo/resource-agents/wiki
GNU General Public License v2.0

PostgreSQL promotion method. #17

Closed: soulhunter closed this issue 12 years ago

soulhunter commented 12 years ago

Greetings,

I actually have some code for this, but wanted to open a discussion on it first. The code is here:

https://github.com/soulhunter/resource-agents/commit/64a10f265566396ddf6f7bc6b07a5e6eaa2efe51

As you can see in that patch, I just shut down PostgreSQL, remove recovery.conf (I could also rename it, but we recreate it as needed, so...) and start it again. The idea is to avoid a timeline switch and almost eliminate the need for a shared archive (provided the standbys are "close enough" to the newly promoted one).
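To make the three steps concrete, here is a minimal shell sketch of the restart-based promotion described above. It is not the code from the linked commit: the function name `promote_by_restart` and the `PG_CTL` override variable are made up for illustration, and it assumes a PostgreSQL version that still uses recovery.conf and a `pg_ctl` binary on the PATH.

```shell
# Sketch only: promote a hot standby by restarting it without recovery.conf,
# rather than triggering a normal promotion (which switches timelines).
# PG_CTL and promote_by_restart are hypothetical names, not part of the RA.

PG_CTL="${PG_CTL:-pg_ctl}"

promote_by_restart() {
    pgdata="$1"

    # 1. Stop the standby cleanly and wait until it is down.
    "$PG_CTL" -D "$pgdata" -w -m fast stop || return 1

    # 2. Remove recovery.conf so the next start does not enter standby mode.
    #    (It gets recreated later if this node becomes a standby again.)
    rm -f "$pgdata/recovery.conf" || return 1

    # 3. Start PostgreSQL again and wait until it accepts connections.
    "$PG_CTL" -D "$pgdata" -w start || return 1
}
```

It would be called as `promote_by_restart /var/lib/pgsql/data` (the path is only an example). The cost compared to a normal promotion is the restart itself: clients are disconnected while the server is down.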

I know the extra WAL shipping will work, but it has the disadvantage that the WAL files get shipped twice (once through streaming replication, and again through WAL shipping). We could keep an extra cold standby around as a WAL archive (in addition to a WAL archive on the PRI) and have the HS use that archive, but that is a more complex setup.

With this promotion method you can still have the archive if you want, so that really lagged standbys can catch up, but it is not that important. Remember that a timeline switch requires the .history file(s) to be available, and these are not streamed. So, if we allow the timeline switch, the WAL archive becomes increasingly important, even if a standby lags by just 1 WAL segment.
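For illustration, the "archive is optional" idea could look like this when the standby's recovery.conf is recreated. This is a rough sketch under assumptions: `write_recovery_conf`, the connection string and the archive directory are hypothetical and not taken from the patch. Only `standby_mode`, `primary_conninfo` and `restore_command` are real recovery.conf parameters.

```shell
# Sketch only: write a standby recovery.conf where the WAL archive is optional.
# write_recovery_conf and its arguments are hypothetical names for this example.
write_recovery_conf() {
    pgdata="$1"
    conninfo="$2"          # must not contain single quotes in this sketch
    archive_dir="${3:-}"   # empty: rely on streaming replication only

    {
        echo "standby_mode = 'on'"
        echo "primary_conninfo = '$conninfo'"
        if [ -n "$archive_dir" ]; then
            # %f is the requested WAL file name, %p the path to copy it to.
            echo "restore_command = 'cp $archive_dir/%f %p'"
        fi
    } > "$pgdata/recovery.conf"
}
```

Calling it with two arguments gives a streaming-only standby; passing a third argument such as `/mnt/wal_archive` adds the archive as a fallback for standbys that have fallen too far behind.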

What do you think?

Ildefonso Camargo

t-matsuo commented 12 years ago

Please see https://github.com/ClusterLabs/resource-agents/pull/109

soulhunter commented 12 years ago

OK, I will continue the discussion there.