moby / libnetwork

networking for containers
Apache License 2.0

Implement SRV records for swarm services #1362

Open sanimej opened 8 years ago

sanimej commented 8 years ago

Implement SRV records for swarm services. libnetwork #1163 added changes in the DNS server to handle SRV queries, but the service DB update based on the swarm service life cycle is not integrated yet.

sandekar commented 7 years ago

Is there a chance that Docker Swarm will publish SRV records? We could really use that extra host information in them.

Why do I ask? At the company where I work, we are building an infrastructure consisting of Prometheus plus various exporters deployed as global Swarm services (Node exporter, cAdvisor, etc.) to collect metrics about our hosts and containers. In the Prometheus configuration, we scrape those exporters using DNS resolution of tasks.<service-name>.

But because Swarm publishes only A/AAAA records, we get only the IP addresses of the exporter containers. If an exporter service is restarted on a host (or the host itself is restarted), or is deployed to a new host, its IP address changes, so we cannot reliably attribute the metric data to it.

This makes it difficult to render metric data by host, and even more difficult to trace problems related to a specific host or service, because we have to trace by the IP address of the container in which the particular exporter is deployed.

Prometheus supports SRV records and exposes a label called __meta_dns_name which could be used to relabel the metric's instance, making it possible to include the hostname in the scraped metrics.
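
To make the difference concrete: an A/AAAA answer carries only an address, whereas an SRV answer (RFC 2782) carries a priority, weight, port, and target host name. The following is a minimal Python sketch, not anything Swarm or Prometheus ships, of how a consumer could turn SRV answers into host:port scrape targets. The record values, host names, and helper names (`SrvRecord`, `parse_srv_rdata`, `scrape_targets`) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    """Fields of an SRV answer as defined in RFC 2782."""
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_rdata(rdata: str) -> SrvRecord:
    """Parse the 'priority weight port target' text form of SRV RDATA."""
    priority, weight, port, target = rdata.split()
    # Drop the trailing dot of a fully qualified target name.
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


def scrape_targets(records):
    """Build 'host:port' strings, lowest priority value first."""
    ordered = sorted(records, key=lambda r: (r.priority, r.target))
    return [f"{r.target}:{r.port}" for r in ordered]


# Hypothetical answers for an exporter service; the host names are made up.
answers = [
    parse_srv_rdata("0 10 9100 node-2.example.internal."),
    parse_srv_rdata("0 10 9100 node-1.example.internal."),
]
print(scrape_targets(answers))
```

With answers of this shape, a scrape target could carry a stable host name instead of a container IP address that changes on restart.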

johan-adriaans commented 6 years ago

+1 for this. My use case is the new HAProxy 1.8.0 feature that automatically configures its servers based on SRV records:

> DNS SRV records (Olivier Houchard): in order to go a bit further with DNS resolution, SRV records were implemented. The address, port and weight attributes will be applied to servers. New servers are automatically added provided there are enough available templates, and servers which disappear are automatically removed from the farm. By combining server templates and SRV records, it is now trivial to perform service discovery.

https://www.mail-archive.com/haproxy@formilux.org/msg28004.html

xenoterracide commented 6 years ago

+1, would want this regardless of swarm, just basic network

blop commented 6 years ago

More information about potential use with HAProxy here: https://www.haproxy.com/blog/dns-service-discovery-haproxy/

ddebroy commented 6 years ago

I wanted to clarify something about the HAProxy use case mentioned by @blop and @johan-adriaans: are you launching services using swarm in a cluster but want to use HAProxy to perform the load balancing directly on the swarm task containers rather than go through swarm's IPVS based load balancing?

@xenoterracide I am not sure I understand what you are referring to as "just basic network"? My understanding of this issue is to implement support for DNS SRV queries for a swarm service (with the answer pointing to the service's task containers). Are you looking for SRV support against individual container names (launched directly using docker run)?

xenoterracide commented 6 years ago

By "just basic network" I mean, e.g., running docker-compose (or docker run) without swarm, so that I can simply run the same cluster locally.

> Are you looking for SRV support against individual container names (launched directly using docker run)?

I could see either name or alias. The DNS alias is the best match for me, since I set it explicitly so that it stays consistent regardless.

> you launching services using swarm in a cluster but want to use HAProxy to perform the load balancing directly on the swarm task containers rather than go through swarm's IPVS based load balancing

Also yes, but for other things in addition to load balancing, such as TLS termination, caching, and HTTP header addition/translation. (I didn't even know swarm had a load balancer until 10s ago.)

Here's one of my compose configurations. Whether in swarm or compose, or if I started them with run, I would expect that for master and slave, which both have EXPOSE 8080, I would be able to run a query like _http._tcp.dex.master. This particular HAProxy will be responsible for both TLS termination and load balancing; I have another one, though, that needs to send updated headers, because of CloudFront. Of course, I can see that this might require additional configuration as well, as SRV records can return different ports than their names imply. So you might have to be able to do something like srv: - http:8080 in the config, or better, ports: - 8080:8080:http, although in this case I've no desire to actually expose the service on the public network, so...

version: "3"
services:
  master:
    build: ./etc/docker/dex
    image: 927476265057.dkr.ecr.us-east-1.amazonaws.com/dex/dex:latest
    ports:
      - 8001:8000
      - 1098:1098
    volumes:
      - ./dex-ui/target:/mnt/ui
    environment:
      - JMX_PORT=1098
      - HIBERNATE_SEARCH_DEFAULT_WORKER_BACKEND=jgroupsMaster
    networks:
      cluster:
        aliases:
          - dex.master
  slave:
    build: ./etc/docker/dex
    image: 927476265057.dkr.ecr.us-east-1.amazonaws.com/dex/dex:latest
    ports:
      - 8002:8000
      - 1099:1099
    volumes:
      - ./dex-ui/target:/mnt/ui
    environment:
      - JMX_PORT=1099
      - HIBERNATE_SEARCH_DEFAULT_WORKER_BACKEND=jgroupsSlave
    networks:
      cluster:
        aliases:
          - dex.slave
  lb:
    build:
      context: etc/docker/load-balancer
      args:
        - CONFIG=haproxy.cfg
    ports:
      - 80:80
      - 443:443
    networks:
      cluster:
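
As a rough illustration of the naming scheme mentioned above, the sketch below builds the SRV query name for a service, protocol, and DNS alias, and parses the suggested `8080:8080:http` port form. That three-part form is only the proposal from this comment, not existing Compose syntax, and the function names are hypothetical.

```python
def srv_query_name(service: str, proto: str, name: str) -> str:
    """Return the RFC 2782 style owner name, e.g. _http._tcp.dex.master."""
    return f"_{service}._{proto}.{name}"


def parse_proposed_port(spec: str):
    """Split the proposed 'published:target:service' form into its parts.

    This three-part form is the proposal from the comment, not existing
    Compose syntax.
    """
    published, target, service = spec.split(":")
    return int(published), int(target), service


published, target, service = parse_proposed_port("8080:8080:http")
for alias in ("dex.master", "dex.slave"):
    # The name a client would query, and the port it would expect in the answer.
    print(srv_query_name(service, "tcp", alias), "->", target)
```

The aliases are the ones from the compose file above; a client such as HAProxy would query these names to learn each backend's port.
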

blop commented 6 years ago

@ddebroy Yes. We need L7 HTTP routing with advanced rules and processing that only a full HTTP proxy like HAProxy offers. We're currently using https://github.com/docker/dockercloud-haproxy, but it has some bugs and is no longer maintained.

It'd be nice to have a native and lightweight integration so that the proxy can reconfigure itself when services are scaled, started, or stopped. HAProxy 1.8 introduces support for automatic reconfiguration using SRV records. As docker currently provides all the discovery through DNS, that would be even better than the current method used by dockercloud, which requires access to the docker API.

cjbottaro commented 6 years ago

@ddebroy

> I wanted to clarify something about the HAProxy use case mentioned by @blop and @johan-adriaans: are you launching services using swarm in a cluster but want to use HAProxy to perform the load balancing directly on the swarm task containers rather than go through swarm's IPVS based load balancing?

Yes, for running multiple web apps on a single swarm cluster. The HAProxy container is running in vip mode, then routing to different apps via connected overlay networks.

nhh commented 6 years ago

This would be a very cool feature that I am currently missing on both swarm and local docker-compose!

/push

burgoyn1 commented 6 years ago

Came across an issue where SRV records would be extremely helpful to solve my problem. Any idea when this will be implemented?

flaviostutz commented 6 years ago

+1 here! I am configuring a Ceph cluster to run exclusively on Swarm. It has a recent feature for finding monitor nodes using SRV records, and I missed having this here. I wish I could declare a label on a service (say, "srv-dns=ceph-mon") and Swarm would automatically add the service name to its internal DNS.

phoenix741 commented 6 years ago

+1 here

Would like to use several records to have readable metrics.

phifty commented 5 years ago

+1

It would be great to use endpoint_mode: vip, but also resolve the tasks associated with a service via DNS.

osegarra commented 5 years ago

+1

I'd like to deploy an etcd cluster that can be discovered using SRV records.

gboddin commented 5 years ago

+1 I'd like to use Swarm's own discovery and decrease complexity

GCSBOSS commented 4 years ago

We also need this for use with Prometheus, not only with Swarm but with regular docker-compose.

chrisbecke commented 4 years ago

+1 seems like a logical thing. How are people even using etcd without this?

doctorpangloss commented 4 years ago

So is this dead?

UchihaYuki commented 3 years ago

+1 I need it with Prometheus as well~~

juanjo-vlc commented 3 years ago

I need HAProxy because I need to inject some HTTP headers into the request before querying the backend (Graylog), and also for fine-grained control of each backend's usage. Graylog can notify HAProxy of its own state, so it can fake its own death and prevent HAProxy from opening new connections.

Radiergummi commented 6 months ago

Another plus one here. I tried to set up a RabbitMQ cluster in Swarm, and service discovery was a major pain point. Having a way to add SRV records would alleviate this.