zalando-stups / stups-etcd-cluster

Etcd cluster appliance for the STUPS (AWS) environment

Support for individual A records (etcd-%d.example.org) #41

Open mikkeloscar opened 7 years ago

mikkeloscar commented 7 years ago

Currently there is support for service discovery through SRV records. This somewhat works with the etcd2 proxy in Container Linux, but seems to be getting phased out with etcd3. The etcd3 gateway/proxy supports it, but there have been no updates for a long time, and AFAIK it doesn't even work right now because it generates a list of endpoints in the wrong format: [endpoint1.:2379,endpoint2.:2379]. It's an easy fix, but it shows that no one is really using it.
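To illustrate the kind of fix meant here, below is a minimal Python sketch. It assumes the stray dot in `endpoint1.:2379` comes from the trailing dot of the fully qualified SRV target name; the function name `srv_to_endpoints` and the sample hostnames are hypothetical, and this is not the gateway's actual code.

```python
def srv_to_endpoints(records):
    """Convert (target, port) pairs from SRV answers into host:port endpoints.

    SRV targets are usually fully qualified ("etcd-0.example.org."), so the
    trailing dot is stripped before the port is appended.
    """
    return [f"{target.rstrip('.')}:{port}" for target, port in records]


if __name__ == "__main__":
    # Hypothetical SRV answers for illustration only.
    sample = [("etcd-0.example.org.", 2379), ("etcd-1.example.org.", 2379)]
    print(",".join(srv_to_endpoints(sample)))
```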

A more common setup is to pass the application or etcd3 gateway a list of endpoints. So I'm wondering if this could be supported in the appliance?

Essentially it should just create individual A records:

etcd-0.example.org   10.0.0.1
etcd-1.example.org   10.0.0.2
etcd-2.example.org   10.0.0.3

Which could then be configured in the etcd3 gateway:

etcd gateway start \
  --listen-addr=127.0.0.1:2379 \
  --endpoints=etcd-0.example.org:2379,etcd-1.example.org:2379,etcd-2.example.org:2379

Is it as simple as just creating these records, or is there more to it?
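For what it's worth, the endpoint list for the gateway could be derived from just the domain and the cluster size once the records follow the `etcd-%d` pattern. A small Python sketch, assuming a fixed cluster size and client port 2379 (`gateway_command` is a hypothetical helper, not part of the appliance):

```python
def gateway_command(domain, size, client_port=2379, listen_addr="127.0.0.1:2379"):
    """Build the argv for `etcd gateway start` from the etcd-%d naming pattern."""
    endpoints = ",".join(
        f"etcd-{index}.{domain}:{client_port}" for index in range(size)
    )
    return [
        "etcd", "gateway", "start",
        f"--listen-addr={listen_addr}",
        f"--endpoints={endpoints}",
    ]


if __name__ == "__main__":
    # Prints the same command as above for a three-node cluster.
    print(" ".join(gateway_command("example.org", 3)))
```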

/cc @CyberDem0n

CyberDem0n commented 7 years ago

I think it shouldn't be that hard to do; https://github.com/zalando-incubator/stups-etcd-cluster/blob/master/etcd.py#L555 has all the necessary information for that.
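As a rough idea of what creating the records might involve, here is a Python sketch that builds a Route 53 change batch from a list of member IPs. It does not reflect how etcd.py is actually structured; the function name, the TTL of 60 seconds, and the rule that a member's index is its position in the list are all assumptions.

```python
def a_record_change_batch(domain, private_ips, ttl=60):
    """Build a Route 53 ChangeBatch that upserts one A record per member.

    Assumption: the member at position i in private_ips gets etcd-i.<domain>.
    """
    changes = []
    for index, ip in enumerate(private_ips):
        changes.append({
            "Action": "UPSERT",  # create the record or overwrite the old IP
            "ResourceRecordSet": {
                "Name": f"etcd-{index}.{domain}.",
                "Type": "A",
                "TTL": ttl,
                "ResourceRecords": [{"Value": ip}],
            },
        })
    return {"Changes": changes}


if __name__ == "__main__":
    batch = a_record_change_batch("example.org", ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
    for change in batch["Changes"]:
        record = change["ResourceRecordSet"]
        print(record["Name"], record["ResourceRecords"][0]["Value"])
```

With boto3, for example, such a dict could be passed as the `ChangeBatch` argument of the Route 53 client's `change_resource_record_sets` call, together with the hosted zone ID. How indexes stay stable across a rolling upgrade is left open here.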

But the question is how the etcd gateway works. Does it resolve the hostname into an IP for every new connection, or just once (at start)? This is pretty important, because after a rolling upgrade all instances will change their IPs and therefore all A records would be updated.
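The difference between the two behaviours can be sketched in Python as follows. This says nothing about what the etcd gateway actually does; `Endpoint` and `resolve_ipv4` are hypothetical names, and the fake resolver in the usage part only simulates an A record changing after a rolling upgrade.

```python
import socket


def resolve_ipv4(hostname):
    """Return the IPv4 addresses the hostname resolves to right now."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


class Endpoint:
    """An endpoint that is re-resolved for every new connection."""

    def __init__(self, hostname, port, resolver=resolve_ipv4):
        self.hostname = hostname
        self.port = port
        self.resolver = resolver

    def address(self):
        # Resolving here, not in __init__, is what picks up updated A records.
        ips = self.resolver(self.hostname)
        if not ips:
            raise LookupError(f"no A record for {self.hostname}")
        return ips[0], self.port


if __name__ == "__main__":
    # Simulated DNS: the record changes between two connection attempts.
    records = {"etcd-0.example.org": ["10.0.0.1"]}
    endpoint = Endpoint("etcd-0.example.org", 2379, resolver=lambda h: records.get(h, []))
    print(endpoint.address())
    records["etcd-0.example.org"] = ["10.0.1.7"]
    print(endpoint.address())
```

A client that resolved only once at start would keep using the first address and lose the member after the upgrade.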

mikkeloscar commented 7 years ago

> This is pretty important, because after a rolling upgrade all instances will change their IPs and therefore all A records would be updated.

Yup, this I don't know, will try to look into it.

hjacobs commented 7 years ago

@mikkeloscar maybe we can (instead of using etcd proxy) do the DNS discovery "manually" during node startup on Container Linux. That would be rather safe IMHO, get rid of one moving part (etcd proxy) and would not require any changes in the STUPS etcd cluster.

mikkeloscar commented 7 years ago

@hjacobs but we would still need to handle the case where the etcd cluster is updated.

CoreOS uses the etcd3 gateway/proxy themselves, fwiw.