Closed (nickstenning closed 7 years ago)
Of the four services, we've migrated two away: RabbitMQ moved to a hosted provider, and Squid moved into the Docker containers.
The other two services are only used by WordPress. @nickstenning is moving them to a new server today, because we might not migrate WordPress to Pantheon before the maintenance happens on the db instance. Once the new WordPress website is up and running, we can just terminate the server and the work is done.
This work was accelerated due to an AWS maintenance window.
MySQL and Redis are now running on a new (hopefully temporary) wpdb instance.
Context
At the moment we run a server called (for historical reasons) db, which runs the following services:
- RabbitMQ (for Hypothesis) (We've moved this to CloudAMQP.)
- Squid proxy (for Via) (We've moved this into the Docker container.)

Currently, a failure of this server will:
- render Via unavailable
- cause major data consistency problems for Hypothesis (new annotations will not be added to the search index)
- disable "real-time" annotation updates
- disable outbound email from Hypothesis

In addition, because this server is a single point of failure, it is inconvenient to take it down for security updates. As a result, it is the only server in our infrastructure that does not regularly get kernel upgrades.
Discussion
Ideally, we would remove all single points of failure from the infrastructure we use for "business-critical systems" (i.e. for annotation services).
This goal will be aided by moving WordPress to Pantheon, which will eliminate MySQL and Redis from the list of services that need to be replaced before db can be shut down.

We can likely include Squid in the Via Docker container so that each instance talks to its own local proxy. (In light of the need to accelerate this work due to an AWS maintenance window, we've done this.)

The last remaining service is RabbitMQ. We can either look into maintaining our own clustered RabbitMQ, or possibly move to a managed service such as CloudAMQP. (In light of the need to accelerate this work due to an AWS maintenance window, we've migrated to CloudAMQP.)