openshift / origin-metrics

hawkular-cassandra CrashLoopBackOff due to "Unable to gossip with any seeds" #74

Closed yepengxj closed 8 years ago

yepengxj commented 8 years ago

When I deploy origin-metrics in the openshift-infra project, the hawkular-cassandra pod always ends up in CrashLoopBackOff due to "Unable to gossip with any seeds". Error log:

java.lang.RuntimeException: Unable to gossip with any seeds
    at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1328)
    at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:543)
    at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:754)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:688)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:580)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:292)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:488)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:595)

cassandra.yaml in container:

seed_provider:
# Addresses of hosts that are deemed contact points.
# Cassandra nodes use this list of hosts to find each other and learn
# the topology of the ring.  You must change this if you are running
# multiple nodes!
- class_name: org.hawkular.openshift.cassandra.OpenshiftSeedProvider
  parameters:
      # seeds is actually a comma-delimited list of addresses.
      # Ex: "<ip1>,<ip2>,<ip3>"
      - seeds: "hawkular-cassandra-1-mfzr2"

........ listen_address: hawkular-cassandra-1-mfzr2 ........

All nodes of the Origin cluster are in the same AWS EC2 security group, and the security group allows all traffic between its members.

mwringe commented 8 years ago

If you are building the images yourself, you may currently run into this problem (https://github.com/openshift/origin-metrics/pull/73). You can test whether this is the problem by running oc get service hawkular-cassandra-nodes and checking whether the CLUSTER_IP is 'None'.
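
For anyone who wants to script that check, here is a minimal Python sketch. It assumes the service is dumped as JSON (for example with oc get service hawkular-cassandra-nodes -o json) and that the value appears under spec.clusterIP. The sample document and the helper names are hypothetical, for illustration only. Whether a value of 'None' points to the problem is for the comment above to decide; the sketch only reports the value.

```python
import json


def cluster_ip(service_json):
    """Return spec.clusterIP from a Service object serialized as JSON, or None if absent."""
    service = json.loads(service_json)
    return service.get("spec", {}).get("clusterIP")


def cluster_ip_is_none(service_json):
    """True when the service reports the literal string 'None' as its cluster IP."""
    return cluster_ip(service_json) == "None"


if __name__ == "__main__":
    # Hypothetical sample, trimmed to the fields this check reads.
    sample = json.dumps({
        "kind": "Service",
        "metadata": {"name": "hawkular-cassandra-nodes"},
        "spec": {"clusterIP": "None"},
    })
    print("clusterIP:", cluster_ip(sample))
    print("is 'None':", cluster_ip_is_none(sample))
```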

Otherwise, can you please attach more of the Cassandra logs? This sometimes happens when the OpenShift DNS server does not start up properly and the Cassandra node cannot resolve hostnames. The logs will usually mention the IP address it's trying to connect to, or just the hostname, depending on the situation.
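
A rough way to check the hostname-resolution theory from inside the pod, assuming a Python interpreter is available there (that is an assumption, the image may not ship one). The sketch uses the standard library's socket.getaddrinfo; the helper name resolve_host is hypothetical. Pass it the seed or listen_address hostname from cassandra.yaml.

```python
import socket
import sys


def resolve_host(hostname):
    """Return the sorted unique addresses hostname resolves to, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # The resolver could not map the name to any address.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Usage: python check_dns.py <hostname> [<hostname> ...]
    for name in sys.argv[1:]:
        addrs = resolve_host(name)
        print(name, "->", ", ".join(addrs) if addrs else "unresolved")
```

An empty result for the seed hostname would be consistent with the DNS explanation above; a non-empty one suggests looking elsewhere.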

mwringe commented 8 years ago

Housekeeping to close older issues. If you think this issue is not resolved, please reopen it.