mesosphere / marathon

Deploy and manage containers (including Docker) on top of Apache Mesos at scale.
https://mesosphere.github.io/marathon/
Apache License 2.0

Master/slave database failover question #1621

Closed trompx closed 9 years ago

trompx commented 9 years ago

Hello,

First, thanks for the 8.2 release! Regarding master/slave database clusters, I am wondering how failover could be implemented in Marathon. Say I have two tasks, each with a service tag for automatic HAProxy configuration:

If my Redis master goes down, Marathon will try to redeploy the task, but with a failover system in place, the slave will automatically be promoted to master. That change should be reflected in Marathon, so that the former Redis master task is redeployed as a slave and not as a master.

When the failover system detects that the master is down, would changing the service tag (for both the master and the slave) in Marathon be the way to go?

curl -XPUT \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  http://marathon:8080/v2/apps/services/redismaster \
  -d '{
    "env": {
        "SERVICE_NAME": "redismaster",
        "SERVICE_TAGS": "slave,redis"
    }
}'
curl -XPUT \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  http://marathon:8080/v2/apps/services/redisslave \
  -d '{
    "env": {
        "SERVICE_NAME": "redisslave",
        "SERVICE_TAGS": "master,redis"
    }
}'

To make this work, is there a way to put Marathon on hold (that is, prevent it from redeploying a failed task) until the healthy slave node has been promoted to master? Any input on the "Marathon way" of doing this would be great.

Thanks,

Xavier

kolloch commented 9 years ago

I am not a Redis expert. Do the Redis slaves forward requests to the master?

In that case, the Marathon way would be to configure load balancing across all healthy Redis instances. The config should be updated either event-driven or periodically.

If you change any environment variables or labels, Marathon will restart the app, so that is not what you want to do.
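As a rough illustration of the periodic option, here is a minimal sketch that turns Marathon's task list into HAProxy `listen` sections. It assumes Marathon is reachable at `$MARATHON_URL` and that `GET /v2/tasks` with `Accept: text/plain` returns one whitespace-separated line per app and service port (app name, service port, then `host:port` entries). The function name `tasks_to_haproxy` and the `generate` argument are made up for this example, and the output is only a config fragment, not a full haproxy.cfg.

```shell
#!/bin/sh
# Sketch only: regenerate HAProxy backends from Marathon's task list.
# Assumed input format, one line per app and service port:
#   <app> <service port> <host:port> <host:port> ...

MARATHON_URL="${MARATHON_URL:-http://marathon:8080}"  # assumed address

# Read task lines on stdin and print one HAProxy "listen" section per line.
tasks_to_haproxy() {
    awk '
        NF >= 2 {
            printf "listen %s-%s\n  bind 0.0.0.0:%s\n  mode tcp\n", $1, $2, $2
            # Every remaining field is a host:port of a running task.
            for (i = 3; i <= NF; i++)
                printf "  server %s-%d %s check\n", $1, i - 2, $i
        }
    '
}

# Only contact Marathon when invoked as "<script> generate",
# so the function above can be sourced on its own.
if [ "${1:-}" = "generate" ]; then
    curl -fsS -H "Accept: text/plain" "$MARATHON_URL/v2/tasks" | tasks_to_haproxy
fi
```

Run from cron (or a loop with `sleep`), the output could be written into the HAProxy config, followed by a reload. An event-driven variant would trigger the same regeneration from Marathon events instead of a timer.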

trompx commented 9 years ago

Hi,

What exactly do you mean by the slaves forwarding requests to the master? I use consul + registrator, so each time a container is deployed, my HAProxy config gets updated.

So I guess the only way to make this work is to not set any role-specific parameters in the Marathon task, and instead have the task configure itself at launch, for example with a run.sh script that sets the container's master/slave parameters according to what is already up (using the info in Consul, in my case), since it is not possible to change the Marathon config without restarting the task.
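Something along these lines is what I have in mind for run.sh. This is only a sketch under assumptions: Consul's HTTP API is reachable at `$CONSUL_URL`, the current master is registered as service `redis` with the tag `master`, and `curl` and `jq` are available in the container. The function names `find_master` and `redis_args` and the `start` argument are made up for the example.

```shell
#!/bin/sh
# run.sh sketch: choose the Redis role at container start instead of
# hard-coding it in the Marathon app definition.

CONSUL_URL="${CONSUL_URL:-http://consul:8500}"  # assumed address

# Print "host port" of the first passing instance tagged "master",
# or nothing if Consul is unreachable or no master is registered.
# Note: if Service.Address is empty in your setup, use Node.Address instead.
find_master() {
    curl -fsS "$CONSUL_URL/v1/health/service/redis?tag=master&passing" 2>/dev/null |
        jq -r '.[0] | select(. != null) | "\(.Service.Address) \(.Service.Port)"'
}

# Turn "host port" into redis-server arguments; an empty value means
# no master was found, so this instance starts as master.
redis_args() {
    if [ -n "$1" ]; then
        echo "--slaveof $1"
    fi
}

# Only launch Redis when invoked as "run.sh start",
# so the functions can be sourced on their own.
if [ "${1:-}" = "start" ]; then
    # Word splitting is intentional: redis_args prints separate arguments.
    exec redis-server $(redis_args "$(find_master)")
fi
```

With this, both Marathon apps can share the same definition, and a restarted former master comes back as a slave of whatever instance is currently tagged `master`. Two instances starting at the same moment could both see no master, so the failover system still has to settle that case.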

Thanks for the clarification :)

kolloch commented 9 years ago

You are welcome.