QubitProducts / bamboo

HAProxy auto configuration and auto service discovery for Mesos Marathon
Apache License 2.0

When deploying new version, there is always an overlap #155

Open sergesyrota opened 9 years ago

sergesyrota commented 9 years ago

When I deploy a new version of an app to Mesos in a Docker container, Bamboo picks up the new hosts as soon as they are deployed, so the full pool (old and new tasks) is available to route requests to before the deployment has finished. This means that multiple consecutive requests can hit different versions and return inconsistent data. I tried adding a health check as well, and the behavior is the same.

Marathon has a minimumHealthCapacity flag. I thought it meant that no traffic is sent to the new version until this capacity is reached, and that all traffic then switches to the new version when Marathon starts killing the old tasks.

From what I can tell, Marathon provides a "version" in the event hook. Would it be possible to look at it and make sure that requests are forwarded to only one version at a time (if I understand the meaning of this flag correctly)?

rasputnik commented 9 years ago

I'd log an issue with the Marathon project - that was also my understanding of how it worked, but in practice it's slightly different. See this thread:

https://groups.google.com/forum/#!topic/marathon-framework/i_tg5EErtQI

(a bit old, but I haven't seen any commits indicating it's changed)

sergesyrota commented 9 years ago

I'm not sure that's a Marathon issue. They notify on every step of the way, and it seems it's the subscriber's decision what to do with those events. The only thing that Marathon is doing incorrectly (IMO), or is missing, is identification of the current production version. They change the version of the app right after you post an update, but IMO there should be a separate indication of which version is "live" now. They should promote the new version to prod just before they start killing off the old version of the app. But until that's done, I think it would make sense for Bamboo to make this decision.
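A minimal sketch of what "Bamboo makes this decision" could look like, written in Python for brevity. This is not Bamboo's code: the task dictionaries, the `pick_live_version` and `backends_for` helpers, and the `required_healthy` threshold are all hypothetical, and it assumes each task can be tagged with the app version it was launched from and a health flag.

```python
from collections import defaultdict


def pick_live_version(tasks, current_live, new_version, required_healthy):
    """Return the single version that should receive traffic.

    Hypothetical policy: keep routing to the current live version until
    the new version has at least `required_healthy` healthy tasks.
    """
    healthy_by_version = defaultdict(int)
    for task in tasks:
        if task["healthy"]:
            healthy_by_version[task["version"]] += 1

    if healthy_by_version[new_version] >= required_healthy:
        return new_version
    return current_live


def backends_for(tasks, live_version):
    """List host:port pairs for healthy tasks of the live version only."""
    return [
        f'{task["host"]}:{task["port"]}'
        for task in tasks
        if task["version"] == live_version and task["healthy"]
    ]
```

With a policy like this, the proxy configuration would only ever contain tasks of one version, and the switch happens in a single reload once the new version is ready. The threshold is a design choice; it could, for example, be tied to the app's instance count.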

lclarkmichalek commented 9 years ago

Your understanding of minimumHealthCapacity is not correct. When Marathon is updating an application, it will never let the number of healthy running tasks go below instances * minimumHealthCapacity, and never let the total go above instances * (1 + maximumOverCapacity), since maximumOverCapacity is the fraction of additional instances allowed on top of the target count. That means that with, say, MHC = 1 and MOC = 1, Marathon will start instances of the new version immediately, whereas with MHC = 0.5 and MOC = 0, it will first kill up to 0.5 * instances of the old tasks, start as many new tasks as possible, and then repeat that process until all tasks are new.
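To make the arithmetic concrete, here is a small Python sketch of those two bounds. It reads maximumOverCapacity as the fraction of additional instances allowed on top of the target count, which is how Marathon's documentation describes it. The function name `upgrade_bounds` is made up for illustration, and the rounding (round the minimum up, round the extra capacity down) is an assumption, not necessarily what Marathon does internally.

```python
import math


def upgrade_bounds(instances, minimum_health_capacity=1.0, maximum_over_capacity=1.0):
    """Return (min_healthy, max_total) task counts during a rolling upgrade.

    Rounding here is an assumption for illustration only.
    """
    # Fewest healthy tasks that must stay up at any point in the upgrade.
    min_healthy = math.ceil(instances * minimum_health_capacity)
    # Target count plus the extra tasks allowed while old and new overlap.
    max_total = instances + math.floor(instances * maximum_over_capacity)
    return min_healthy, max_total


# MHC = 1, MOC = 1 with 4 instances: all 4 old tasks stay up, and up to
# 4 new ones may be started alongside them.
print(upgrade_bounds(4, 1.0, 1.0))

# MHC = 0.5, MOC = 0 with 4 instances: up to 2 old tasks may be killed
# first, and the total never exceeds 4.
print(upgrade_bounds(4, 0.5, 0.0))
```

In both cases old and new tasks run side by side for part of the deployment, which is why a proxy that routes to every running task sees the overlap described in this issue.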

Back to the issue at hand; this is something I've been thinking about, and there are a number of possible "correct" behaviours.

I would like to implement these behaviours, as they are important going forwards, and do require support from Bamboo itself, and not just the templates.

timoreimann commented 8 years ago

Isn't the issue raised here basically the question of how to deal with breaking API changes? If that's the case, a classical solution that does not require support from a system like Marathon or Bamboo is to use API versioning and make sure all clients run against one specific version. This can be implemented either by running one Marathon app per major API version, each uniquely addressable (through, say, a separate version path component in the URL, or via content negotiation on the resource level if we're talking REST here), or by having the next-generation, new-version app continue to serve clients asking for the old version.

Once successfully deployed, all old clients and subsequently the old API can be turned off gradually.
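As a rough illustration of the "version path component" variant, the Python sketch below maps the first URL path segment to a backend pool, with one pool per Marathon app. The pool names, addresses, and the `pool_for_path` helper are hypothetical; in practice this routing would live in the proxy configuration rather than in application code.

```python
def pool_for_path(path, pools, default=None):
    """Pick a backend pool from the first path segment, e.g. /v1/users -> v1."""
    first_segment = path.lstrip("/").split("/", 1)[0]
    return pools.get(first_segment, default)


# One pool per major API version, each backed by its own Marathon app
# (addresses are placeholders).
pools = {
    "v1": ["10.0.0.1:31001", "10.0.0.2:31001"],
    "v2": ["10.0.0.3:31002"],
}

print(pool_for_path("/v1/users", pools))
print(pool_for_path("/v2/users/42", pools))
```

Because a client pins itself to `/v1` or `/v2`, its consecutive requests always reach the same API version, regardless of how far a deployment of the other version has progressed.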