gliderlabs / logspout

Log routing for Docker container logs
MIT License
4.65k stars 679 forks source link

RouteManager not looking up DNS SRV records #270

Open mhamann opened 7 years ago

mhamann commented 7 years ago

According to the docs Logspout should look for a route address's DNS SRV records to figure out which ports to hit:

but you can also specify a name that resolves via DNS to one or more SRV records. That means this works great with Consul for service discovery.

This is not working in practice. Attempting to add a route using the example in the routesapi doc (which uses a format like loggregator.service.consul) simply results in an error: Bad route: missing port in address

Here's the actual API call I'm making:

curl -X POST -H "Content-Type: application/json" -H "Cache-Control: no-cache" -H -d '{
    "adapter": "syslog",
    "address": "logstash.service.consul"
}' "http://localhost:8000/routes"

And here's the response from dig (inside the container) showing valid DNS records being returned (including SRV):

dig srv logstash.service.consul

; <<>> DiG 9.10.4-P6 <<>> srv logstash.service.consul
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2353
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 3

;; QUESTION SECTION:
;logstash.service.consul. IN SRV

;; ANSWER SECTION:
logstash.service.consul. 0 IN   SRV 1 1 31942 10.173.144.84.node.default.consul.
logstash.service.consul. 0 IN   SRV 1 1 31008 10.173.144.128.node.default.consul.
logstash.service.consul. 0 IN   SRV 1 1 31501 10.173.144.115.node.default.consul.

;; ADDITIONAL SECTION:
10.173.144.84.node.default.consul. 0 IN A   10.173.144.84
10.173.144.128.node.default.consul. 0 IN A  10.173.144.128
10.173.144.115.node.default.consul. 0 IN A  10.173.144.115

;; Query time: 2 msec
;; SERVER: 172.17.0.1#53(172.17.0.1)
;; WHEN: Sat Feb 25 21:10:20 UTC 2017
;; MSG SIZE  rcvd: 224
mhamann commented 7 years ago

I added code to the TCP adapter's Dial method that looks like this:

import "github.com/benschw/srv-lb/lb"

func (_ *tcpTransport) Dial(addr string, options map[string]string) (net.Conn, error) {

    if v := strings.Split(addr, ":"); len(v) < 2 {
        cfg, err := lb.DefaultConfig()
        if err != nil {
            return nil, err
        }

        l := lb.New(cfg, addr)
        resolvAddr, err := l.Next()

        if err == nil {
            addr = resolvAddr.String()
        }
    }

    raddr, err := net.ResolveTCPAddr("tcp", addr)
    if err != nil {
        return nil, err
    }
    conn, err := net.DialTCP("tcp", nil, raddr)
    if err != nil {
        return nil, err
    }
    return conn, nil
}

Ideally, it seems like creating a common wrapper for Dial would be a nice place to put this code. Another option would be to put the DNS resolution code in its own function and just reference it from each transport.

Thoughts?

michaelshobbs commented 7 years ago

Yea seems like this doesn't work right now. Maybe that quote was aspirational at one point. In any case, I'd accept a patch that had the string parsing/resolution in their own functions and have the udp and tcp transports call out to them like you described. Let me know if you have time to put that together; with tests please. Otherwise I'll get to it at some point in the next few weeks.

mhamann commented 7 years ago

I have it done for TLS. I just need to clean it up a bit and apply to the other transports. Will try to get a PR in next week.