This completes the full service lookup by adding the missing A record lookup step:
1. Look up the SRV records for a given service
2. Look up the A records for each FQDN returned by the previous SRV query
3. Add the IP:port pairs to the list of discovered nodes
Previously, when that step was skipped, the FQDN (not the IP address) was being passed to new DiscoveryNode, which would cause an UnresolvedAddressException.
Here's what the process looks like if you were to do it manually on the command line:
$ dig @localhost -p 8600 elasticsearch-transport.service.consul SRV
;; ANSWER SECTION:
elasticsearch-transport.service.consul. 0 IN SRV 1 1 9301 machine.node.dc1.consul.
elasticsearch-transport.service.consul. 0 IN SRV 1 1 9300 machine.node.dc1.consul.
$ dig @localhost -p 8600 machine.node.dc1.consul. A
machine.node.dc1.consul. 0 IN A 1.2.3.4
Now the IP:port pairs can be constructed: 1.2.3.4:9301 and 1.2.3.4:9300
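For illustration, the three steps can be sketched in Java roughly as follows. This is not the code from this change: `SrvLookupSketch`, `SrvRecord`, `toAddresses`, and the `resolveA` function are hypothetical names, and the A record answers are canned values taken from the dig output above rather than real DNS queries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class SrvLookupSketch {

    /** Hypothetical holder for the target and port fields of one SRV answer. */
    static final class SrvRecord {
        final String target;
        final int port;

        SrvRecord(String target, int port) {
            this.target = target;
            this.port = port;
        }
    }

    /**
     * Expands SRV answers into "ip:port" strings. resolveA stands in for the
     * A record query and may return several addresses for one FQDN.
     */
    static List<String> toAddresses(List<SrvRecord> srvRecords,
                                    Function<String, List<String>> resolveA) {
        List<String> addresses = new ArrayList<>();
        for (SrvRecord srv : srvRecords) {
            // Step 2: resolve the FQDN from the SRV answer to IP addresses.
            for (String ip : resolveA.apply(srv.target)) {
                // Step 3: pair every IP with the port from the SRV answer.
                addresses.add(ip + ":" + srv.port);
            }
        }
        return addresses;
    }

    public static void main(String[] args) {
        // Step 1 (assumed already done): the two SRV answers from the dig output.
        List<SrvRecord> srv = Arrays.asList(
            new SrvRecord("machine.node.dc1.consul.", 9301),
            new SrvRecord("machine.node.dc1.consul.", 9300));
        // Canned A record answer; no DNS query is made here.
        Function<String, List<String>> resolver = fqdn -> Arrays.asList("1.2.3.4");
        System.out.println(toAddresses(srv, resolver)); // [1.2.3.4:9301, 1.2.3.4:9300]
    }
}
```

Passing the resolver in as a function keeps the pairing logic testable without a running Consul agent.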
I tested this on my own machine with two ES processes on different ports, setting discovery.zen.ping.multicast.enabled: false and discovery.zen.minimum_master_nodes: 2. After the second ES process started, the output of curl -s localhost:9200/_cat/nodes?h=ip,port,v,m switched from {"error":"MasterNotDiscoveredException[waited for [30s]]","status":503} to 127.0.1.1 9300 1.7.2 *\n127.0.1.1 9301 1.7.2 m.
Also, this should support multiple A records per FQDN. Although I don't know if it works, the cloud-aws plugin already does this.
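As a rough illustration of what handling multiple A records involves, the JDK's `InetAddress.getAllByName` returns every address a name resolves to. Whether this change or the cloud-aws plugin resolves names this way is an assumption; `MultiARecordSketch` and `allAddresses` are hypothetical names.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class MultiARecordSketch {

    /** Returns the textual form of every address the host name resolves to. */
    static List<String> allAddresses(String host) {
        try {
            List<String> ips = new ArrayList<>();
            // getAllByName returns one InetAddress per address record found.
            for (InetAddress address : InetAddress.getAllByName(host)) {
                ips.add(address.getHostAddress());
            }
            return ips;
        } catch (UnknownHostException e) {
            // Surface resolution failures instead of silently skipping the host.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "localhost" is used only so the sketch runs without a Consul agent.
        System.out.println(allAddresses("localhost"));
    }
}
```

Each returned address would then be paired with the port from the SRV answer, giving one discovered node per A record.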
Closes #4.
/cc @grantr