hazelcast / hazelcast-kubernetes

Kubernetes Discovery for Hazelcast
Apache License 2.0

Port in DnsEndpointResolver is 0 #8

Closed wombat closed 8 years ago

wombat commented 8 years ago

I just used this module and deployed two services on Google Cloud. I added a headless service which returns service records for both services, but the Hazelcast connection fails because the DnsEndpointResolver tries to connect to port 0, even though the service itself exposes port 5701.

Any thoughts?

wombat commented 8 years ago

@noctarius @saturnism any ideas?

amoAHCP commented 8 years ago

I can confirm this, same setup and same "port 0" instead of 5701 thing.

amoAHCP commented 8 years ago

Hi @noctarius @saturnism, I think I found the bug in the DnsEndpointResolver class and created a workaround. In line 68 you call srv.getPort(), but this always returns 0, so for the moment I added the following:

private int getHazelcastPort(int port) {
    if (port > 0) return port;
    return NetworkConfig.DEFAULT_PORT;
}

But that's not the final solution. Even if you configure 5701, this port may change. The correct way is to get the properties, collect every possible port number, and create a single Address entry for each host:portNumber tuple. How can I access the properties defined in the XML?

Andy
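The idea described above (fall back to a default port, or expand each host into one candidate per configured port) could be sketched roughly as follows. This is only an illustration: the class PortFallback and its methods are hypothetical and not part of hazelcast-kubernetes, the configured port list is assumed to come from the discovery properties, and DEFAULT_PORT is assumed to match NetworkConfig.DEFAULT_PORT (5701).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of hazelcast-kubernetes.
final class PortFallback {
    // Assumed to mirror Hazelcast's NetworkConfig.DEFAULT_PORT.
    static final int DEFAULT_PORT = 5701;

    // Use the port from the SRV record when it is valid, else the default.
    static int effectivePort(int srvPort) {
        return srvPort > 0 ? srvPort : DEFAULT_PORT;
    }

    // Build "host:port" candidates for one resolved host.
    // A valid SRV port wins; otherwise every configured port is tried,
    // and the default port is used if nothing is configured.
    static List<String> candidates(String host, int srvPort, List<Integer> configuredPorts) {
        List<String> result = new ArrayList<>();
        if (srvPort > 0) {
            result.add(host + ":" + srvPort);
            return result;
        }
        for (int port : configuredPorts) {
            result.add(host + ":" + port);
        }
        if (result.isEmpty()) {
            result.add(host + ":" + DEFAULT_PORT);
        }
        return result;
    }
}
```

In a real resolver each candidate would become an Address entry rather than a string; strings are used here only to keep the sketch self-contained.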

wombat commented 8 years ago

I just did a plain dns srv query with dig in gce and got the following:

dig srv vertx-service.hazelcast-demo.svc.cluster.local         

; <<>> DiG 9.10.2 <<>> srv vertx-service.hazelcast-demo.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22574
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 3

;; QUESTION SECTION:
;vertx-service.hazelcast-demo.svc.cluster.local. IN SRV

;; ANSWER SECTION:
vertx-service.hazelcast-demo.svc.cluster.local. 30 IN SRV 10 33 0 92ffb3e.vertx-service.hazelcast-demo.svc.cluster.local.
vertx-service.hazelcast-demo.svc.cluster.local. 30 IN SRV 10 33 0 abdc364a.vertx-service.hazelcast-demo.svc.cluster.local.
vertx-service.hazelcast-demo.svc.cluster.local. 30 IN SRV 10 33 0 423149b9.vertx-service.hazelcast-demo.svc.cluster.local.

;; ADDITIONAL SECTION:
92ffb3e.vertx-service.hazelcast-demo.svc.cluster.local. 30 IN A 10.0.1.7
abdc364a.vertx-service.hazelcast-demo.svc.cluster.local. 30 IN A 10.0.0.6
423149b9.vertx-service.hazelcast-demo.svc.cluster.local. 30 IN A 10.0.1.4

;; Query time: 1 msec
;; SERVER: 10.3.240.10#53(10.3.240.10)
;; WHEN: Tue Apr 19 11:36:44 UTC 2016
;; MSG SIZE  rcvd: 374

What I can see from this is that the SRV records themselves contain port 0. So it's a GCE or Kubernetes bug, I guess?
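For reference, SRV record data follows the RFC 2782 field order: priority, weight, port, target. A minimal Java sketch that splits one record from the answer section above makes the zero port explicit. The class name SrvRdata is hypothetical and only used for illustration.

```java
// Hypothetical value class for illustration; not part of hazelcast-kubernetes.
final class SrvRdata {
    final int priority;
    final int weight;
    final int port;
    final String target;

    private SrvRdata(int priority, int weight, int port, String target) {
        this.priority = priority;
        this.weight = weight;
        this.port = port;
        this.target = target;
    }

    // Parse the data part of an SRV record: "<priority> <weight> <port> <target>".
    static SrvRdata parse(String rdata) {
        String[] fields = rdata.trim().split("\\s+");
        if (fields.length != 4) {
            throw new IllegalArgumentException("expected 4 fields, got " + fields.length);
        }
        return new SrvRdata(
                Integer.parseInt(fields[0]),
                Integer.parseInt(fields[1]),
                Integer.parseInt(fields[2]),
                fields[3]);
    }
}
```

Parsing "10 33 0 92ffb3e.vertx-service.hazelcast-demo.svc.cluster.local." yields priority 10, weight 33 and port 0, which is the value the resolver then tries to connect to.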

saturnism commented 8 years ago

Hiya, quick question, how was the service defined? Can you attach the yaml file? Thanks!

amoAHCP commented 8 years ago

Hi Ray, nice to see you ;-) (JavaLand Hackergarten..?). Here is my service definition...

apiVersion: v1
kind: Service
metadata:
  labels:
    name: frontend-verticle-dns
    visualize: "true"
  name: frontend-verticle-dns
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8181
      name: frontend-verticle-dns
    - port: 5701
      name: hazelcast
  selector:
    name: frontend-verticle-dns

Andy

saturnism commented 8 years ago

Yup! :)

I haven't had a chance to test out your config, but I wonder if the multiple-port service is causing the issue.

wombat commented 8 years ago

@saturnism I don't think that multiple ports cause the issue, as my service looks like this:

apiVersion: v1
kind: Service
metadata:
  labels:
    name: vertx-service
  name: vertx-service
spec:
  clusterIP: None
  ports:
    - port: 5701
      targetPort: 5701
  selector:
    technology: vertx

amoAHCP commented 8 years ago

Hi, exposing multiple ports should work since v1beta3. The rest of my service looks like w0mbat's example (see: https://github.com/amoAHCP/kube_vertx_demo/tree/dns-resolving).

Andy

noctarius commented 8 years ago

But port discovery won't work with DNS lookups as DNS doesn't know about ports. You should use the service discovery (REST API) lookup instead. Please only select servicename and namespace to activate that type of lookup.

wombat commented 8 years ago

@noctarius Of course DNS knows about ports, that's what SRV records were made for!

@amoAHCP In the meantime I found the solution: the query itself for vertx-service.hazelcast-demo.svc.cluster.local was wrong.

I had to use the following K8s Service definition (port name was the important part):

apiVersion: v1
kind: Service
metadata:
  labels:
    name: vertx-service
  name: vertx-service
spec:
  clusterIP: None
  ports:
    - name: hazelcast
      port: 5701
      targetPort: 5701
  selector:
    technology: vertx

After that, I was able to use the following DNS query: dig srv _hazelcast._tcp.vertx-service.hazelcast-demo.svc.cluster.local

By the way, it is also documented here: Kubernetes also supports DNS SRV (service) records for named ports. If the "my-service.my-ns" Service has a port named "http" with protocol TCP, you can do a DNS SRV query for "_http._tcp.my-service.my-ns" to discover the port number for "http".
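The same named-port query can be built and issued from Java. The following is a rough sketch, not the module's implementation: the class SrvQuery and its methods are hypothetical, the lookup goes through the JDK's built-in JNDI DNS provider, and "cluster.local" is assumed as the cluster domain.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.InitialDirContext;

// Hypothetical helper for illustration; not part of hazelcast-kubernetes.
final class SrvQuery {
    // Build "_<port-name>._<protocol>.<service>.<namespace>.svc.<cluster-domain>".
    static String name(String portName, String protocol, String service,
                       String namespace, String clusterDomain) {
        return "_" + portName + "._" + protocol + "." + service + "."
                + namespace + ".svc." + clusterDomain;
    }

    // Query SRV records using the JDK's JNDI DNS provider and the
    // system-configured resolver. Each entry is the raw record data,
    // "<priority> <weight> <port> <target>".
    static List<String> lookup(String srvName) throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.dns.DnsContextFactory");
        env.put(Context.PROVIDER_URL, "dns:");
        InitialDirContext ctx = new InitialDirContext(env);
        try {
            Attribute srv = ctx.getAttributes(srvName, new String[] {"SRV"}).get("SRV");
            List<String> records = new ArrayList<>();
            if (srv != null) {
                for (int i = 0; i < srv.size(); i++) {
                    records.add(String.valueOf(srv.get(i)));
                }
            }
            return records;
        } finally {
            ctx.close();
        }
    }
}
```

For the service above, SrvQuery.name("hazelcast", "tcp", "vertx-service", "hazelcast-demo", "cluster.local") produces the name used in the dig query; lookup only returns useful results when run inside the cluster.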

noctarius commented 8 years ago

Oh I see the srv records know the ports. Sorry wasn't aware of that, great to know :) 👍

wombat commented 8 years ago

@noctarius Yes, they are really great. I use them a lot by now with varying technologies. BTW: Thanks again to @thockin for pointing me to the solution!