Implement and expose ready state.

timoreimann commented 7 years ago

During deployments (only), we mark applications as ready depending on the outcome of potentially configured readiness checks.

The change goes along with some major refactoring, primarily focused towards retrieving all necessary state from a single Marathon API request thanks to the embed=apps.tasks{deployments,readiness} query parameter. This is necessary in order to retrieve a single, consistent app/task state not skewed by two API requests send away at slightly different offsets.

Additionally, we stop considering tasks as ready which have not yet reached the TASK_RUNNING state since still staging or otherwise non-ready tasks are bound to fail if taken into load-balancing rotation prematurely.

In order to leverage the new Ready state, one would have to adjust the HAProxy template. What we did was wrap the line adding a single server with something like this

{{- if .Ready }}
  <the line referenced above; see explanation below though>
{{- end }}{{/* if .Ready */}}

I didn't include the change to the template in this PR since it seems to be an individual decision what gets put into the HAProxy health check option. For instance, in our cluster, we use the readiness check path if it's available and otherwise fall back to the health check. I was unsure on whether that should be included in the PR, or if an explanation in the documentation might be preferable. Personally, I'm leaning towards the latter.

Happy to make further adjustments in this (or any other) regard if you let me know.

The PR refs #230 but supersedes the solution I described there since the readiness check takes TASK_KILLING into consideration implicitly.

coveralls commented 7 years ago

Coverage increased (+17.2%) to 56.567% when pulling bcb256a17a5be5fbcec23530a033fbfe8e29d73a on timoreimann:upstream-expose-ready-state into 61c92cadfa6b77d5e61412b845f147d2cbe9e28b on QubitProducts:master.

coveralls commented 7 years ago

Coverage increased (+17.2%) to 56.567% when pulling 767b0d12ee5da703e414a797815a4932031372ef on timoreimann:upstream-expose-ready-state into 61c92cadfa6b77d5e61412b845f147d2cbe9e28b on QubitProducts:master.

timoreimann commented 7 years ago

I had forgotten a local commit that would cause the CI to fail. It's fixed now.

j1n6 commented 7 years ago

Thank you. It's a large commit, I just need sometime to review it once I get some moment. Please give me a few days time.

The test coverage is promising :)

On May 23, 2017, at 20:11, Timo Reimann notifications@github.com wrote:

I had forgotten a local commit that would cause the CI to fail. It's fixed now.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

timoreimann commented 7 years ago

@activars sure, take your time! Let me know if I can help somehow.

Apologies for the somewhat biggish PR. Unfortunately, I had to change the two API requests to using a single request with embedded query parameters in order guarantee consistency across applications, tasks, and the related readiness checks.

mohamedhaleem commented 6 years ago

@timoreimann - could you post your configuration as reference template to use this feature ?

timoreimann commented 6 years ago

@mohamedhaleem I extended the HAProxy template in my follow-up PR #238: Have a look at this section of the haproxy_template.cfg.

In our production environment, we also configure the readiness check endpoint as the HAProxy health check, thereby allowing Marathon applications to indicate unreadiness at will (which can be desirable if, say, one of the application's dependencies is failing). It's pretty much like the vanilla template, just with the Marathon health check replaced by the readiness check. (We require our Marathon applications to always provide a readiness check.)

Hope this helps. Let me know if you have further questions.

mohamedhaleem commented 6 years ago

Thank you

From: Timo Reimann notifications@github.com<mailto:notifications@github.com> Reply-To: QubitProducts/bamboo reply@reply.github.com<mailto:reply@reply.github.com> Date: Saturday, September 9, 2017 at 3:59 PM To: QubitProducts/bamboo bamboo@noreply.github.com<mailto:bamboo@noreply.github.com> Cc: Local Administrator Mohamed_Haleem@Comcast.com<mailto:Mohamed_Haleem@Comcast.com>, Mention mention@noreply.github.com<mailto:mention@noreply.github.com> Subject: Re: [QubitProducts/bamboo] Implement and expose ready state. (#236)

@mohamedhaleemhttps://github.com/mohamedhaleem I extended the HAProxy template in my follow-up PR #238https://github.com/QubitProducts/bamboo/pull/238: Have a look at this section of the haproxy_template.cfghttps://github.com/QubitProducts/bamboo/blob/1f581208365b1650d3ac6d1d5ea84da8b1fcf864/config/haproxy_template.cfg#L73-L75.

In our production environment, we also configure the readiness check endpoint as the HAProxy health check, thereby allowing Marathon applications to indicate unreadiness at will (which can be desirable if, say, one of the application's dependencies is failing). It's pretty much like the vanilla template, just with the Marathon health check replaced by the readiness check. (We require our Marathon applications to always provide a readiness check.)

Hope this helps. Let me know if you have further questions.

- You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/QubitProducts/bamboo/pull/236#issuecomment-328300202, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AGx7onkmXGpfRjX2jt7DjTG56VxwEd1aks5sgu4agaJpZM4NTvhU.

QubitProducts / bamboo

Implement and expose ready state. #236