valeriansaliou / vigil

🚦 Microservices Status Page. Monitors a distributed infrastructure and sends alerts (Slack, SMS, etc.).
https://crates.io/crates/vigil-server
Mozilla Public License 2.0

Service Discovery #13

Open hampsterx opened 6 years ago

hampsterx commented 6 years ago

It would be great to have support for service discovery (DNS SRV records), such as used by Consul, Eureka, etc.

replicas (type: array[string], allowed: TCP or HTTP URLs, default: empty) — Node replica URLs to be probed (only used if mode is poll)

At present I can only list the containerized services I have exposed on the load balancer, which is a small subset of what is running; the rest talk to each other directly. Also, because those services are proxied, Vigil sees only one replica even though there might be half a dozen instances running, so a replica count of 1 is not ideal.

valeriansaliou commented 5 years ago

Hi! That sounds like a nice idea for an alternative way to list services. This can be implemented.

Can you provide examples of what Vigil would query over DNS, how the results would be formatted, and how the returned entries could then be checked?

hampsterx commented 5 years ago

Hi @valeriansaliou, thanks for that.

docker run --net=host consul

This will give you a running Consul server with the web UI on port 8500 and DNS on port 8600.

http://localhost:8500 (web UI)

(Note: out of the box only the Consul agent itself is reported as a service, but you can easily add some for testing using https://www.consul.io/api/agent/service.html)

You can query for DNS SRV records, e.g.:

dig @localhost -p 8600 SRV consul.service.consul

; <<>> DiG 9.11.3-1ubuntu1.2-Ubuntu <<>> @localhost -p 8600 SRV consul.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45285
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 3
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;consul.service.consul.     IN  SRV

;; ANSWER SECTION:
consul.service.consul.  0   IN  SRV 1 1 8300 tim-pc.node.dc1.consul.

;; ADDITIONAL SECTION:
tim-pc.node.dc1.consul. 0   IN  A   127.0.0.1
tim-pc.node.dc1.consul. 0   IN  TXT "consul-network-segment="

;; Query time: 0 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1)
;; WHEN: Tue Nov 20 10:05:42 NZDT 2018
;; MSG SIZE  rcvd: 144

I guess you would need a config option to specify the DNS host/port (most likely it would be a Consul agent running on the same server as Vigil).

A cherry-on-top feature would be the ability to set a minHealthyCount.
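To make the minHealthyCount idea concrete, here is a minimal sketch of how such a threshold could feed into a service status. The function name, the parameter names and the three status labels are assumptions for illustration only; nothing here is taken from Vigil's code or config.

```python
def service_status(healthy: int, total: int, min_healthy_count: int = 1) -> str:
    """Classify a service from the number of replicas that passed their probe.

    All names and status labels here are hypothetical, not Vigil's own.
    """
    # No discovered replicas, or fewer healthy ones than the configured floor
    if total == 0 or healthy < min_healthy_count:
        return "dead"
    # Above the floor, but at least one replica is failing
    if healthy < total:
        return "sick"
    return "healthy"
```

With six discovered instances and a floor of three, four passing probes would give a degraded status rather than a dead one, which is the behaviour a single proxied replica cannot express.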

hampsterx commented 5 years ago

Here is an example of creating a service with two instances:

PUT http://localhost:8500/v1/catalog/register

    {
      "Node": "test1",
      "Address": "127.0.0.1",
      "Service": {
        "Service": "test",
        "Address": "127.0.0.1",
        "Port": 5000
      }
    }

Do the same again for "test2" (port 5001).
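If you would rather script the two registrations, a rough Python equivalent using only the standard library could look like this. It assumes the Consul agent from the docker command is listening on localhost:8500; the helper names `build_register_request` and `register` are made up for this sketch.

```python
import json
import urllib.request

# Assumption: local Consul agent started with the docker command above
CONSUL_HTTP = "http://localhost:8500"


def build_register_request(node: str, port: int, service: str = "test",
                           address: str = "127.0.0.1") -> urllib.request.Request:
    """Build the PUT request for /v1/catalog/register without sending it."""
    payload = {
        "Node": node,
        "Address": address,
        "Service": {"Service": service, "Address": address, "Port": port},
    }
    return urllib.request.Request(
        f"{CONSUL_HTTP}/v1/catalog/register",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def register(node: str, port: int) -> int:
    """Send the registration and return the HTTP status code."""
    with urllib.request.urlopen(build_register_request(node, port)) as resp:
        return resp.status
```

Calling `register("test1", 5000)` and then `register("test2", 5001)` against a running agent should create the two instances used in the rest of this example.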

Then:

GET http://localhost:8500/v1/catalog/service/test

[
  {
    "ID": "",
    "Node": "test1",
    "Address": "127.0.0.1",
    "Datacenter": "dc1",
    "TaggedAddresses": null,
    "NodeMeta": null,
    "ServiceKind": "",
    "ServiceID": "test",
    "ServiceName": "test",
    "ServiceTags": [

    ],
    "ServiceAddress": "127.0.0.1",
    "ServiceMeta": {

    },
    "ServicePort": 5000,
    "ServiceEnableTagOverride": false,
    "ServiceProxyDestination": "",
    "ServiceConnect": {
      "Native": false,
      "Proxy": null
    },
    "CreateIndex": 74,
    "ModifyIndex": 74
  },
  {
    "ID": "",
    "Node": "test2",
    "Address": "127.0.0.1",
    "Datacenter": "dc1",
    "TaggedAddresses": null,
    "NodeMeta": null,
    "ServiceKind": "",
    "ServiceID": "test",
    "ServiceName": "test",
    "ServiceTags": [

    ],
    "ServiceAddress": "127.0.0.1",
    "ServiceMeta": {

    },
    "ServicePort": 5001,
    "ServiceEnableTagOverride": false,
    "ServiceProxyDestination": "",
    "ServiceConnect": {
      "Native": false,
      "Proxy": null
    },
    "CreateIndex": 75,
    "ModifyIndex": 75
  }
]

Then, from dig:

dig @localhost -p 8600 SRV test.service.consul

; <<>> DiG 9.11.3-1ubuntu1.2-Ubuntu <<>> @localhost -p 8600 SRV test.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12442
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 3
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;test.service.consul.       IN  SRV

;; ANSWER SECTION:
test.service.consul.    0   IN  SRV 1 1 5001 test2.node.dc1.consul.
test.service.consul.    0   IN  SRV 1 1 5000 test1.node.dc1.consul.

;; ADDITIONAL SECTION:
test2.node.dc1.consul.  0   IN  A   127.0.0.1
test1.node.dc1.consul.  0   IN  A   127.0.0.1

;; Query time: 0 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1)
;; WHEN: Tue Nov 20 11:19:19 NZDT 2018
;; MSG SIZE  rcvd: 162

I.e. we get two instances listening on:

127.0.0.1:5000
127.0.0.1:5001
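As a rough illustration of how the SRV answer maps to those two entries, the sketch below parses dig-style output and joins each SRV target with its A record from the ADDITIONAL section. It is a text-parsing sketch only: a real implementation would use a DNS resolver library rather than dig output, and the function name `targets_from_dig` is hypothetical.

```python
import re

# <name> <ttl> IN SRV <priority> <weight> <port> <target>
SRV_LINE = re.compile(r"^(\S+)\s+\d+\s+IN\s+SRV\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)$")
# <name> <ttl> IN A <ipv4>
A_LINE = re.compile(r"^(\S+)\s+\d+\s+IN\s+A\s+(\S+)$")


def targets_from_dig(output: str) -> list[str]:
    """Return sorted "address:port" strings for every SRV record in the output."""
    addresses = {}  # target hostname -> IPv4 address, from A records
    srv = []        # (target hostname, port) pairs, from SRV records
    for raw in output.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and dig comment/header lines
        match = A_LINE.match(line)
        if match:
            addresses[match.group(1)] = match.group(2)
            continue
        match = SRV_LINE.match(line)
        if match:
            srv.append((match.group(5), int(match.group(4))))
    # Fall back to the bare hostname when no A record came with the answer
    return sorted(f"{addresses.get(host, host.rstrip('.'))}:{port}"
                  for host, port in srv)
```

Each returned entry could then be probed as a separate replica, so the replica count would reflect the real number of instances.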

Let me know if I can be of any further help, or act as a guinea pig :)

valeriansaliou commented 5 years ago

Thanks for the details, I'll use them when processing this :)

deissh commented 5 years ago

You could also discover containers on the local machine or in a Docker Swarm by their labels, as Traefik does. That would be amazing!