Open screeley44 opened 8 years ago
@screeley44 do you get a list of IP addresses when you do a dig cassandra-peers.default.cluster.local from a Cassandra container?
What's the output of oc get namespaces? It's possible that Openshift uses a different namespace from default.cluster.local.
@vyshane - I'm using the default namespace (also referred to as project for OSE):
[root@ose1 usr_configs]# oc get namespaces
NAME LABELS STATUS AGE
default
My dig is not returning the IP addresses of the containers:
root@cassandra-vfujv:/etc# dig $PEER_DISCOVERY_DOMAIN
; <<>> DiG 9.9.5-9+deb8u3-Debian <<>> cassandra-peers.default.cluster.local.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 32805
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
;cassandra-peers.default.cluster.local. IN A
;; AUTHORITY SECTION:
cluster.local. 60 IN SOA ns.dns.cluster.local. hostmaster.cluster.local. 1449158400 28800 7200 604800 60
;; Query time: 1 msec
;; SERVER: 192.168.122.251#53(192.168.122.251)
;; WHEN: Thu Dec 03 16:49:22 UTC 2015
;; MSG SIZE rcvd: 109
root@cassandra-vfujv:/etc# dig cassandra-peers
; <<>> DiG 9.9.5-9+deb8u3-Debian <<>> cassandra-peers
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 8607
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;cassandra-peers. IN A
;; Query time: 0 msec
;; SERVER: 192.168.122.251#53(192.168.122.251)
;; WHEN: Thu Dec 03 16:50:23 UTC 2015
;; MSG SIZE rcvd: 33
The services (cassandra-peers and cassandra-service) look good to me based on get services and get endpoints:
[root@ose1 cassandra-custom]# oc get services
NAME CLUSTER_IP EXTERNAL_IP PORT(S) SELECTOR AGE
cassandra-peers None
It looks like DNS is not working for services. Is the DNS addon enabled for the Kubernetes cluster?
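One quick way to check this from inside a container without dig is a small Python sketch like the one below. It tries both name forms that appear in this thread: the plain `cassandra-peers.default.cluster.local` and the `svc` form `cassandra-peers.default.svc.cluster.local`. As far as I know, Kubernetes DNS publishes services under `<service>.<namespace>.svc.<cluster domain>`, but treat that as an assumption to verify on your cluster. The function names `candidate_peer_names` and `resolve_ipv4` are made up for this example.

```python
import socket


def candidate_peer_names(service, namespace="default", cluster_domain="cluster.local"):
    """Return the DNS names worth trying for a headless service.

    The "svc" form comes first because it is the one that resolved in this
    thread; the shorter form is kept as a fallback (assumption: some setups
    may still answer it).
    """
    return [
        f"{service}.{namespace}.svc.{cluster_domain}",
        f"{service}.{namespace}.{cluster_domain}",
    ]


def resolve_ipv4(name, getaddrinfo=socket.getaddrinfo):
    """Return the sorted IPv4 addresses for name, or [] if it does not resolve."""
    try:
        infos = getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        # NXDOMAIN, SERVFAIL and similar failures all end up here.
        return []
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    for name in candidate_peer_names("cassandra-peers"):
        print(name, resolve_ipv4(name))
```

If neither name returns any addresses, the problem is most likely cluster DNS rather than Cassandra. If only the `svc` form answers, PEER_DISCOVERY_DOMAIN probably needs to point at that name.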
Hello, I am having a similar problem. DNS seems to respond, but the nodes are not joining:
root@cassandra-c7wdb:/# dig $PEER_DISCOVERY_DOMAIN
; <<>> DiG 9.9.5-9+deb8u6-Debian <<>> cassandra-peers.default.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63289
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;cassandra-peers.default.svc.cluster.local. IN A
;; ANSWER SECTION:
cassandra-peers.default.svc.cluster.local. 30 IN A 10.244.0.4
cassandra-peers.default.svc.cluster.local. 30 IN A 10.244.0.3
;; Query time: 1 msec
;; SERVER: 10.0.0.10#53(10.0.0.10)
;; WHEN: Wed May 04 17:32:47 UTC 2016
;; MSG SIZE rcvd: 91
root@cassandra-c7wdb:/# nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.244.0.3 111.87 KB 256 100.0% 058c38c7-0f79-419c-b42a-a7c2f9782110 Kubernetes Cluster
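To make the mismatch above explicit (DNS returns two addresses, the ring shows one), here is a rough Python sketch that compares the two. It assumes node lines in `nodetool status` start with a two-letter state code such as `UN` followed by the address, which matches the output pasted here but may not cover every Cassandra version. `ring_addresses` and `missing_from_ring` are hypothetical helper names.

```python
import re

# A node line starts with Up/Down plus Normal/Leaving/Joining/Moving,
# e.g. "UN 10.244.0.3 ...". Header lines do not match this pattern.
_NODE_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")


def ring_addresses(nodetool_output):
    """Map each node address in `nodetool status` output to its state code."""
    nodes = {}
    for line in nodetool_output.splitlines():
        match = _NODE_LINE.match(line.strip())
        if match:
            nodes[match.group(2)] = match.group(1)
    return nodes


def missing_from_ring(dns_addresses, nodetool_output):
    """Return the DNS-advertised addresses that are not Up/Normal in the ring."""
    ring = ring_addresses(nodetool_output)
    return sorted(addr for addr in dns_addresses if ring.get(addr) != "UN")
```

With the dig answer and the nodetool output above, this would report 10.244.0.4 as the peer that has not joined.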
Do you see any log errors when you tail the cassandra pods?
I think what happened was the DNS did not populate in time for the second node. I've got a cluster up now that has two nodes by waiting some time after the first node was up.
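If the timing really is the issue, one possible workaround is to have a new node wait until the peer name resolves to enough addresses before it starts. The Python sketch below is only an illustration of that idea and is not part of the image discussed here; `lookup`, `wait_for_peers`, and the timeout values are made up for the example.

```python
import socket
import time


def lookup(name):
    """Return the IPv4 addresses for name, or [] while it does not resolve."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        # socket.gaierror and socket.herror are both OSError subclasses.
        return []


def wait_for_peers(name, expected, resolve=lookup, timeout=120.0, interval=5.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll DNS until at least `expected` peer addresses are published.

    Raises TimeoutError if the records have not appeared after `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        peers = resolve(name)
        if len(peers) >= expected:
            return peers
        if clock() >= deadline:
            raise TimeoutError(
                f"{name} resolved to {len(peers)} address(es), wanted {expected}"
            )
        sleep(interval)


if __name__ == "__main__":
    # Example: a second node waits until at least one peer is visible.
    print(wait_for_peers("cassandra-peers.default.svc.cluster.local", expected=1))
```

The `resolve`, `sleep`, and `clock` parameters are there only so the loop can be exercised without a real cluster.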
Have you used this setup very extensively @vyshane ?
@vyshane - hello, I'm experimenting with your examples and everything seems to run fine. I have an OpenShift 3.1 cluster running master + 1 node, with a Gluster cluster on the backend for Persistent Volume support.
I created the peer service, the service, and the rc, and my pods run. I'm using a glusterfs volume for data persistence, and the data survives multiple restarts of the pods/rc. But when I scale, I'm not seeing the pods join the C* ring, and I'm not sure what I'm missing. I don't have a ton of experience with k8s or Cassandra, but from each container I can ping cassandra-peer (the peer service), so I know they are able to connect.
It's unclear to me right now whether I need to change my PEER_DISCOVERY_DOMAIN or something else.
Some output from oc (kubectl for OpenShift):
[root@ose1 cassandra-custom]# oc get pods
NAME READY STATUS RESTARTS AGE
cassandra-vfujv 1/1 Running 0 59s
cassandra-x36ay 1/1 Running 0 1m
[root@ose1 cassandra-custom]# oc exec -it cassandra-x36ay -- nodetool status testspace
Datacenter: datacenter1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.1.0.32 176.43 KB 256 100.0% 03b19bd1-ce65-4525-89e7-b23c9b3f0a92 rack1