mesosphere / mesos-dns

DNS-based service discovery for Mesos.
https://mesosphere.github.com/mesos-dns
Apache License 2.0

proposal: mesos-dns HTTP API watches #283

Open jdef opened 8 years ago

jdef commented 8 years ago

Some clients will want to know right away when service records change. It would be ideal if they could watch (à la k8s or Consul) a service for changes. This would address a major pain point of typical DNS systems: the latency between record updates when apps move across hosts (or otherwise experience a shift in IPs).

/cc @karlkfi

tsenart commented 8 years ago

While I understand the use case, I have a few remarks regarding this idea in the context of Mesos-DNS:

  1. Due to lack of better options, Mesos-DNS currently needs to periodically poll the state of the Mesos master. This inherently delays state propagation.
  2. With TTL=0, the cost of an extra round trip to the DNS server is still incurred at request time, but it effectively eliminates staleness, just like watches.
  3. Mesos-DNS is a DNS server and I'd like to keep it that way. Adding watch semantics to a DNS server over HTTP smells like feature creep. If we want a way to push Mesos state changes downstream, we should build that separately. In addition, at least in the DCOS context, we're evaluating Consul for service discovery which, if it moves forward, would effectively provide this functionality with its Raft-backed K/V store, not its DNS server component.

jdef commented 8 years ago

Some thoughts:

R(1): The delayed state propagation is a problem for both DNS and HTTP/watch. This is an orthogonal issue that should be resolvable once the Mesos master supports event streams for non-framework clients (WIP). The nice thing about watch semantics is that as soon as state propagation happens, service record updates could be immediately sent downstream (vs. client polling (poor scale) or client caching (TTL)).
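To illustrate the point about pushing updates as soon as new state arrives, here is a minimal Python sketch that turns two successive state snapshots into the change events a watch could send downstream. The `Snapshot` shape and the `diff_snapshots` name are assumptions made for this example; they are not part of Mesos-DNS.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical snapshot shape: service name -> set of "ip:port" endpoints.
Snapshot = Dict[str, FrozenSet[str]]


def diff_snapshots(old: Snapshot, new: Snapshot) -> List[Tuple[str, FrozenSet[str]]]:
    """Return (service, endpoints) for every service whose endpoints changed.

    A service that disappeared is reported with an empty endpoint set.
    Results are ordered by service name so the output is deterministic.
    """
    changes = []
    for service in sorted(old.keys() | new.keys()):
        before = old.get(service, frozenset())
        after = new.get(service, frozenset())
        if before != after:
            changes.append((service, after))
    return changes
```

Whether the snapshots come from polling or from an event stream, only the services that actually changed would need to be sent to watchers.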

R(2): TTL=0 is fine except for (a) broken DNS clients that ignore TTL, and (b) very large clusters where clients are constantly querying the DNS system because TTL=0 and they can't effectively cache record lookups at all. Watch semantics don't have this particular problem: TTLs can be non-zero.
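A rough sketch of the caching trade-off in (b): a client-side cache that honours TTLs. `TTLCache`, its `resolve` callback and the `upstream_queries` counter are hypothetical names for illustration only, and the clock is injectable so the behaviour can be checked without sleeping.

```python
import time
from typing import Callable, Dict, List, Tuple


class TTLCache:
    """Caches lookups for `ttl` seconds and counts upstream queries."""

    def __init__(self, resolve: Callable[[str], List[str]], ttl: float,
                 clock: Callable[[], float] = time.monotonic):
        self._resolve = resolve
        self._ttl = ttl
        self._clock = clock
        # name -> (expiry timestamp, records)
        self._entries: Dict[str, Tuple[float, List[str]]] = {}
        self.upstream_queries = 0

    def lookup(self, name: str) -> List[str]:
        now = self._clock()
        cached = self._entries.get(name)
        # An entry is usable only strictly before its expiry, so ttl=0
        # means nothing is ever served from the cache.
        if cached is not None and now < cached[0]:
            return cached[1]
        self.upstream_queries += 1
        records = self._resolve(name)
        self._entries[name] = (now + self._ttl, records)
        return records
```

With `ttl=0` every lookup reaches the upstream resolver, while a non-zero TTL absorbs repeated lookups until the entry expires, at the price of possibly stale records in between.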

R(3): Mesos-DNS implements an HTTP API (currently) for a reason: not all clients speak SRV but they do want access to service records. Adding support for 'watch'-style semantics isn't a huge leap to make, doesn't add a new protocol to mesos-dns (HTTP already supported), and from an API perspective could be an extension to the existing endpoints (these could accept a watch=true parameter from clients that want streaming updates).
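One way to picture the `watch=true` extension is the in-process sketch below. It is not the Mesos-DNS HTTP API: `WatchHub`, `get` and `publish` are hypothetical names, and a queue stands in for the streaming HTTP response a real endpoint would hold open. The sketch is single-threaded and does no locking.

```python
import queue
from typing import Dict, Iterable, List, Optional, Tuple

Records = Tuple[str, ...]


class WatchHub:
    """Holds the current records per service and notifies watchers on change."""

    def __init__(self) -> None:
        self._records: Dict[str, Records] = {}
        self._watchers: Dict[str, List[queue.Queue]] = {}

    def get(self, service: str, watch: bool = False) -> Tuple[Records, Optional[queue.Queue]]:
        """Return the current records; with watch=True also return an update queue."""
        current = self._records.get(service, ())
        if not watch:
            return current, None
        updates: queue.Queue = queue.Queue()
        self._watchers.setdefault(service, []).append(updates)
        return current, updates

    def publish(self, service: str, records: Iterable[str]) -> None:
        """Store new records and push them to watchers only if they changed."""
        normalized = tuple(sorted(records))
        if self._records.get(service, ()) == normalized:
            return
        self._records[service] = normalized
        for updates in self._watchers.get(service, []):
            updates.put(normalized)
```

A plain `get` behaves like the existing request/response lookup, so the watch flag would be purely additive for clients that do not use it.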

Moreover, I think that it's reasonable to imagine a world wherein each slave node has a mesos-dns-agent that coalesces watches so that Tasks could simply establish a service watch directly on the mesos-dns-agent, which establishes a single connection to a "master" mesos-dns instance (meaning, it's a top-level instance that's consuming an event stream directly from mesos master).
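The coalescing idea can be sketched as follows, assuming a hypothetical `upstream_subscribe(service, callback)` function that represents the single connection to the top-level instance. `CoalescingAgent` is likewise an illustrative name, not an existing component.

```python
from typing import Callable, Dict, List, Tuple

Records = Tuple[str, ...]
Callback = Callable[[Records], None]


class CoalescingAgent:
    """Per-node agent: many local watchers, one upstream watch per service."""

    def __init__(self, upstream_subscribe: Callable[[str, Callback], None]) -> None:
        self._upstream_subscribe = upstream_subscribe
        self._local: Dict[str, List[Callback]] = {}

    def watch(self, service: str, callback: Callback) -> None:
        watchers = self._local.get(service)
        if watchers is None:
            # First local watcher for this service: open the only upstream watch.
            watchers = self._local[service] = []
            self._upstream_subscribe(
                service, lambda records: self._fan_out(service, records)
            )
        watchers.append(callback)

    def _fan_out(self, service: str, records: Records) -> None:
        # Deliver one upstream update to every local watcher of the service.
        for callback in list(self._local[service]):
            callback(records)
```

However many tasks on a node watch the same service, the top-level instance sees a single subscription from that node.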

I'm really not convinced that the stateful store of a consul kv-backed solution is the right answer. It's yet another consensus system on top of a cluster that already requires at least two consensus systems to function (ZK and Mesos). Proceed with a high degree of caution here.

tsenart commented 8 years ago

It seems my email reply didn't get here earlier today. Replying inline:

> Some thoughts:
>
> R(1): The delayed state propagation is a problem for both DNS and HTTP/watch. This is an orthogonal issue that should be resolvable once the Mesos master supports event streams for non-framework clients (WIP). The nice thing about watch semantics is that as soon as state propagation happens, service record updates could be immediately sent downstream (vs. client polling (poor scale) or client caching (TTL)).

I understand the benefits. I was just noting that this issue affects this proposal.

> R(2): TTL=0 is fine except for (a) broken DNS clients that ignore TTL, and (b) very large clusters where clients are constantly querying the DNS system because TTL=0 and they can't effectively cache record lookups at all. Watch semantics don't have this particular problem: TTLs can be non-zero.

With watch semantics, do we have TTLs at all? That's a DNS concept, no?

> R(3): Mesos-DNS implements an HTTP API (currently) for a reason: not all clients speak SRV but they do want access to service records. Adding support for 'watch'-style semantics isn't a huge leap to make, doesn't add a new protocol to mesos-dns (HTTP already supported), and from an API perspective could be an extension to the existing endpoints (these could accept a watch=true parameter from clients that want streaming updates).

Mesos-DNS exposes an HTTP interface to the current Mesos "service registry", which at the moment is completely coupled with this DNS server. Architecturally, this is undesirable and ought to be decoupled. I'd be reluctant to add this kind of feature before that happens.

> Moreover, I think that it's reasonable to imagine a world wherein each slave node has a mesos-dns-agent that coalesces watches so that Tasks could simply establish a service watch directly on the mesos-dns-agent, which establishes a single connection to a "master" mesos-dns instance (meaning, it's a top-level instance that's consuming an event stream directly from mesos master).

Totally agree that a per-node agent architecture (à la Consul) is a very solid direction. That's one of the reasons that led me to evaluate Consul, which I'm currently doing. A number of challenges arise in such scenarios; the first of them, scalable state propagation, is very elegantly solved by constant-load gossip protocols.

> I'm really not convinced that the stateful store of a consul kv-backed solution is the right answer. It's yet another consensus system on top of a cluster that already requires at least two consensus systems to function (ZK and Mesos). Proceed with a high degree of caution here.

I'm not convinced either (yet), but not for that reason. After evaluation, our findings will be well synthesised and made available for consumption.

With that said, Mesos-DNS is currently in a hardening, testing and general quality improvement phase, not one of architectural or functional evolution, at least until we've ruled out the alternative solutions to service discovery in DCOS.

jdef commented 8 years ago

RE: mesos-dns project phase - understood.

My concerns re: Consul stand. Gossip seems like overkill for this particular problem unless you have a complex distributed system like Consul, and I really hope that we don't go down this path. It's such a heavy hammer and adds unnecessary complexity: it is not "doing one thing and doing it well". Let's make sure that we're thinking about huge scales here.

tsenart commented 8 years ago

@jdef: We're weighing pros and cons. There's nothing set in stone.