Closed tomhughes closed 3 years ago
Does this plan look about right to you @joto?
Will disabling and then re-enabling the replication in the database clean up any state from the old feed, or do I need to do something more to ensure we start with clean state?
I was hoping to switch from the test endpoints to the production ones later tomorrow, which implies that both would need to run a bit earlier so that no data gets lost. Probably not a real issue for the rest of the world, though (= low priority requirement).
Also I didn’t hear anything about a need for a new ppa that includes osmdbt 0.2.
I assume that version only contains non-essential changes (documentation and pbf output which we don’t use). If that version would be needed for any reason, it’s available here: https://launchpad.net/~mmd-osm/+archive/ubuntu/osmdbt-test4
Well the only way to switch from test to production is to wind your state back to a time a bit before the last state you got from the test feed and then deal with the overlap and I don't see how timing affects that.
Those two lines you quote are entirely about the current production feed, to make sure it goes forward cleanly with no overlap but there's no way to switch feeds without dealing with an overlap as they're not synchronised.
Well, I thought if we let both endpoints drain after the API is in read-only mode, switching should be straightforward, as we have a defined (matching) state for both streams.
Well I guess I could, but it means there is more work to do in the offline window, which makes it longer, and I was very explicit that we did not intend to provide a way to cleanly switch back.
When I switched rhaegal back a few days ago I just rewound it an hour and let it process the overlap - that should be fine for osm2pgsql consumers though not for more finicky things.
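As a rough illustration of "rewinding an hour", the sketch below computes the sequence number a minutely consumer would restart from. It assumes the consumer keeps a state file with a `sequenceNumber=` line and that the minutely feed advances by one sequence per minute; `rewind_sequence` is a hypothetical helper name, not part of any tool mentioned in this thread.

```shell
# Hypothetical helper: print the sequence number that lies a given number of
# minutes before the one recorded in a state file.
# Assumption: the state file has a line of the form "sequenceNumber=<integer>"
# and the minutely feed advances by one sequence per minute.
rewind_sequence() {
    state_file=$1
    minutes=$2
    # Extract the integer after "sequenceNumber=".
    seq=$(sed -n 's/^sequenceNumber=//p' "$state_file")
    echo $((seq - minutes))
}

# Example use (the path is a placeholder for the consumer's own state file):
#   rewind_sequence /path/to/consumer/state.txt 60
```

The consumer then reprocesses the overlapping diffs, which is why this only suits consumers that tolerate seeing the same changes twice.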
Looks good to me.
If after "Disable replication in the database" you leave enough time for the test feeds to get drained, just re-enabling the replication should be enough to reset everything.
If you do `sudo -u planet osmdbt disable-replication -c /etc/replication/osmdbt-config.yaml`, you don't need to be in the right current directory. Same for `enable-replication`.
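A minimal sketch of the two calls with the config path passed explicitly, so the working directory no longer matters. The `osmdbt_cmd` wrapper and the `RUN` switch are hypothetical additions for a dry run; the commands and the `-c` path are the ones quoted in this comment.

```shell
# Config path as quoted in the comment above.
OSMDBT_CONFIG=/etc/replication/osmdbt-config.yaml

# Hypothetical wrapper: prints the command by default and only executes it
# when RUN=1 is set in the environment.
osmdbt_cmd() {
    # $1 is the subcommand, e.g. disable-replication or enable-replication.
    set -- sudo -u planet osmdbt "$1" -c "$OSMDBT_CONFIG"
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

osmdbt_cmd disable-replication
osmdbt_cmd enable-replication
```

Printing first makes it easy to check the exact command lines before running them during the offline window.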
One other thing I noticed: You are pulling the `master` branch from `osmdbt` in `cookbooks/db/recipes/base.rb`. Shall we change this to some tag so we can work on `master` without it being immediately pushed into production?
Probably I should just set it to the release tag so that switching to a new version requires manual intervention but the repo looks like the releases aren't tagged?
I just tagged `v0.2`, the last release (changes after that are only test/docs).
By the way, latest PPA is based on https://github.com/openstreetmap/osmdbt/commit/c286a79b96666b7ec15931ce196b5bc93ea818b1, but that's ok as changes vs v0.2 only affect docs as @joto wrote.
In case you need to push some emergency changes as new PPA, here's a list of steps to collect, sign and upload all needed files to build a new PPA on the launchpad servers:
That's pretty much all. Launchpad.net account and signing keys need to be set up beforehand.
This has all been completed successfully.
Linking this here in case others are affected as well and are searching for details.
There was a report of osmupdate (version 0.4.5, latest since 2018) being incompatible with the change. I had a quick look at the source and think it relies on the state file starting with a comment to recognize it as a valid result.
https://lists.openstreetmap.org/pipermail/dev/2021-February/031072.html
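To check whether a given state file would pass the kind of test described above, a sketch like the following looks at its first line. This is only a guess at the check osmupdate performs, based on the reading of the source mentioned in this comment; `state_starts_with_comment` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the first line of a state file begins
# with '#', fail otherwise.
# Assumption: this mirrors the check osmupdate is believed to perform.
state_starts_with_comment() {
    first_line=$(head -n 1 "$1")
    case $first_line in
        '#'*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Example use (the file name is a placeholder):
#   if state_starts_with_comment state.txt; then echo "has comment line"; fi
```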
@tomhughes : new build for 0.3 is underway: https://launchpad.net/~mmd-osm/+archive/ubuntu/osmdbt-test4/+packages
I think there's some new command line parameter to turn the additional "#" line on. Not sure if @joto told you about it.
The changelog says:
osmdbt (0.3-focal1) focal; urgency=medium
On 21st February we plan to switch the minutely replication feeds to use osmdbt. The operational plan is:
sudo systemctl stop chef-client.timer chef-client.service
sudo systemctl stop replication-minutely.timer
sudo systemctl stop replication-hourly.timer
sudo systemctl stop replication-daily.timer
sudo -u planet osmdbt disable-replication (run in /etc/replication)
sudo rm /var/lib/replication/minute/*.done
sudo -u planet osmdbt enable-replication (run in /etc/replication)
sudo rm /etc/cron.d/replication-minutely
sudo rm /etc/cron.d/replication-hourly
sudo rm /etc/cron.d/replication-daily
/store/planet/replication/minute/state.txt
/store/planet/replication/hour/state.txt
/store/planet/replication/day/state.txt
sudo systemctl stop apache2.service
sudo systemctl mask apache2.service
sudo chef-client
sudo systemctl unmask apache2.service
sudo systemctl start apache2.service
Later cleanup tasks:
/store/planet/replication/test
/var/lib/replication/test
drop index nodes_xmin_idx;
drop index ways_xmin_idx;
drop index relations_xmin_idx;
drop function xid_to_int4;