Open tsenart opened 9 years ago
Is this still true? Mesos DNS will publish unhealthy instances, even if they use Mesos native health checks in Marathon (MESOS_HTTP(S))?
I don't think anyone is working on this.
On Wed, Mar 28, 2018 at 12:14 PM, Imri Zvik notifications@github.com wrote:
Is this still true? Mesos DNS will publish unhealthy instances, even if they use Mesos native health checks in Marathon (MESOS_HTTP(S))?
This is a much-needed feature. Currently, mesos-dns will happily announce unhealthy instances, which puts the burden of figuring out health on the client (which might need a few retries to get a healthy instance).
Looking at https://github.com/mesosphere/mesos-dns/blob/master/records/state/state.go#L193 this seems to be quite simple?
The state JSON `statuses` hash (the same place where the task state is) will contain the `healthy` boolean if the task has a health check configured and running. If not, the field is omitted.
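To illustrate the "present or omitted" behaviour described above, here is a minimal Go sketch of decoding such a field. The `status` type, the `describe` and `decodeStatuses` helpers, and the sample JSON are assumptions made up for this example; they are not the structs mesos-dns actually uses. The point is only that a `*bool` lets the decoder tell an omitted field (nil) apart from an explicit `false`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// status is a hypothetical, trimmed-down view of one entry in a task's
// "statuses" array. It is not the type used in mesos-dns.
type status struct {
	State   string `json:"state"`
	Healthy *bool  `json:"healthy,omitempty"` // stays nil when the field is omitted
}

// describe names the three cases a pointer lets us tell apart.
func describe(s status) string {
	switch {
	case s.Healthy == nil:
		return "unknown"
	case *s.Healthy:
		return "healthy"
	default:
		return "unhealthy"
	}
}

// decodeStatuses parses a JSON array of status entries.
func decodeStatuses(raw string) ([]status, error) {
	var out []status
	err := json.Unmarshal([]byte(raw), &out)
	return out, err
}

func main() {
	// Made-up sample data: no health field, explicit false, explicit true.
	raw := `[
		{"state": "TASK_RUNNING"},
		{"state": "TASK_RUNNING", "healthy": false},
		{"state": "TASK_RUNNING", "healthy": true}
	]`

	statuses, err := decodeStatuses(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, s := range statuses {
		fmt.Printf("%s: %s\n", s.State, describe(s))
	}
}
```

A plain `bool` would decode an omitted field as `false`, which would make tasks without any health check look unhealthy, so the pointer matters here.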
So it seems a really easy fix would be to omit the record if the `healthy` field is there and is set to false.
Any thoughts about it?
I would also be glad to distinguish between "grace period has not passed yet" and "no health check defined", but for now, having no health awareness whatsoever is worse than not distinguishing these two scenarios. If we treat a missing healthy field as "healthy" (and publish the record), we keep backward compatibility by not affecting tasks without health checks, with the trade-off of publishing unhealthy instances during their grace period (which is already happening today anyway).
Bottom line is that this feature request has gone unanswered for years, and I bet a lot of users of this project would like to see it implemented, even if it does not cover all scenarios today (maybe add a config flag to enable/disable it).
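As a rough sketch of the behaviour proposed above (drop a record only when health is explicitly false, treat a missing value as healthy, and gate the whole thing behind a switch), something like the following Go could work. The `task` type, the `publishable` and `filterTasks` helpers, and the `skipUnhealthy` switch are all hypothetical names invented for this example; none of them is an existing mesos-dns option or API.

```go
package main

import "fmt"

// task is a hypothetical stand-in for a running task that record
// generation would consider. Healthy is nil when no health result exists.
type task struct {
	Name    string
	IP      string
	Healthy *bool
}

// publishable reports whether a record should be generated for t.
// With skipUnhealthy off, every task is published, as happens today.
// With it on, only tasks explicitly reported as unhealthy are dropped;
// a missing health value is treated as healthy for backward compatibility.
func publishable(t task, skipUnhealthy bool) bool {
	if !skipUnhealthy {
		return true
	}
	return t.Healthy == nil || *t.Healthy
}

// filterTasks returns the tasks that should get DNS records.
func filterTasks(tasks []task, skipUnhealthy bool) []task {
	var kept []task
	for _, t := range tasks {
		if publishable(t, skipUnhealthy) {
			kept = append(kept, t)
		}
	}
	return kept
}

func main() {
	yes, no := true, false
	tasks := []task{
		{Name: "web-1", IP: "10.0.0.1", Healthy: &yes},
		{Name: "web-2", IP: "10.0.0.2", Healthy: &no},
		{Name: "web-3", IP: "10.0.0.3", Healthy: nil}, // no health check, or still in grace period
	}
	for _, t := range filterTasks(tasks, true) {
		fmt.Println(t.Name, t.IP)
	}
}
```

With the switch on, `web-2` is the only task left out; `web-3` is still published because its health is unknown, which is exactly the grace-period trade-off mentioned above.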
Mesos-DNS as a service discovery system should be health-aware. This doesn't mean that it can guarantee healthiness of the returned service instances, only that it does its best to direct clients to capable ones.
With that in mind, we should take into consideration the `TaskStatus.healthy` field and work with the Marathon and Mesos teams to promote the use of Mesos native health checks.