rabbitmq / rabbitmq-autocluster

RabbitMQ peer discovery and cluster formation plugin, supports RabbitMQ 3.6.x

Doesn't work in k8s 1.6.3 #34

Closed. tianctrl closed this issue 7 years ago

tianctrl commented 7 years ago

Cluster status of node rabbit@172.17.249.218 ...
[{nodes,[{disc,['rabbit@172.17.249.218']}]},
 {running_nodes,['rabbit@172.17.249.218']},
 {cluster_name,<<"rabbit@rabbitmq-0.rabbitmq.test-rabbitmq.svc.cluster.local">>},
 {partitions,[]},
 {alarms,[{'rabbit@172.17.249.218',[]}]}]

No node can find the others.

fatduo commented 7 years ago

The service name must be the same as the K8S_SERVICE_NAME environment variable (default value 'rabbitmq').
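As a rough illustration of that check, here is a minimal Python sketch. It assumes the Service manifest and the container environment have already been loaded into plain dictionaries. The helper name `service_name_matches` and the sample data are hypothetical; only the K8S_SERVICE_NAME variable and its 'rabbitmq' default come from the comment above.

```python
DEFAULT_SERVICE_NAME = "rabbitmq"  # default value quoted in this thread


def service_name_matches(service_manifest, container_env):
    """Return True if the Service name equals K8S_SERVICE_NAME (or its default)."""
    expected = container_env.get("K8S_SERVICE_NAME", DEFAULT_SERVICE_NAME)
    actual = service_manifest.get("metadata", {}).get("name")
    return actual == expected


# Hypothetical sample data, for illustration only.
service = {"kind": "Service", "metadata": {"name": "rabbitmq"}}
print(service_name_matches(service, {}))
print(service_name_matches(service, {"K8S_SERVICE_NAME": "rabbit-mq"}))
```

With the sample data, the first call compares against the default name and the second against an overridden one, so a renamed Service or a changed variable shows up as a mismatch.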

tianctrl commented 7 years ago

I use all the default YAML files and didn't change anything.

tianctrl commented 7 years ago

I deployed rabbitmq-autocluster in k8s 1.5.1 and it works, but it doesn't work in 1.6.

Does it support Calico IP-in-IP?

The k8s 1.5.1 cluster used Weave, and it worked.

08:21:42.548 [info] Peer discovery backend rabbit_peer_discovery_classic_config does not support registration, skipping randomized startup delay.
08:21:42.549 [info] Peer discovery backend rabbit_peer_discovery_classic_config does not support registration, skipping registration.

Running kubectl logs on these pods shows the messages above at the end. I don't understand what they mean.

michaelklishin commented 7 years ago

Thank you for your time.

Team RabbitMQ uses GitHub issues for specific actionable items engineers can work on. This assumes two things:

  1. GitHub issues are not used for questions, investigations, root cause analysis, discussions of potential issues, etc (as defined by this team)
  2. We have a certain amount of information to work with

We get at least a dozen questions through various venues every single day, often quite light on details. At that rate GitHub issues can very quickly turn into something impossible to navigate and make sense of, even for our team. Because of that, questions, investigations, root cause analysis, and discussions of potential features are all considered to be mailing list material by our team. Please post this to rabbitmq-users.

Getting all the details necessary to reproduce an issue, make a conclusion or even form a hypothesis about what's happening can take a fair amount of time. Our team is multiple orders of magnitude smaller than the RabbitMQ community. Please help others help you by providing a way to reproduce the behavior you're observing, or at least sharing as much relevant information as possible on the list.

Feel free to edit out hostnames and other potentially sensitive information.

When/if we have enough details and evidence we'd be happy to file a new issue.

Thank you.

michaelklishin commented 7 years ago

@tianctrl

Peer discovery backend rabbit_peer_discovery_classic_config does not support registration, skipping randomized startup delay

is a log message that neither this plugin nor RabbitMQ 3.6.x releases produce. You are using master/3.7.0, intentionally or not, which is not even an RC yet. If you intend to use 3.7.0/master, I suggest that you use rabbitmq/rabbitmq-peer-discovery-k8s instead of this plugin.

michaelklishin commented 7 years ago

The message above also means that as far as RabbitMQ [master] is concerned, it is not configured to use Kubernetes for discovery. Instead it falls back to the "classic config" backend (which uses a fixed list from the config, rabbit.cluster_nodes, just like in 3.6.x).
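A quick way to spot this fallback in pod logs is to search for the backend name. The following is only a sketch under assumptions: the function name `uses_classic_config_backend` is hypothetical, and the sample line is the log message quoted earlier in this thread.

```python
CLASSIC_BACKEND = "rabbit_peer_discovery_classic_config"


def uses_classic_config_backend(log_text):
    """Return True if any log line mentions the classic config discovery backend."""
    return any(CLASSIC_BACKEND in line for line in log_text.splitlines())


# Log line copied from this thread.
sample = (
    "08:21:42.548 [info] Peer discovery backend "
    "rabbit_peer_discovery_classic_config does not support registration, "
    "skipping randomized startup delay."
)

if uses_classic_config_backend(sample):
    print("Classic config backend in use; Kubernetes discovery is not configured.")
```

The log text could come from saved `kubectl logs` output. A match only indicates that the node reported the classic config backend, not why it was selected.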

tianctrl commented 7 years ago

Thank you for your time and help. I changed the RabbitMQ version to 3.6.6, but it still doesn't work properly. The RabbitMQ cluster works when all pods are on the same host, but not across two or more hosts. I use Calico networking.

BTW, I found one thing. The pod STATUS shows:

rpc error: code = 2 desc = failed to start container "8561d00c6f17de58a660bc60485c4538180d08708f398370b89c024eec7d473b": Error response from daemon: {"message":"oci runtime error: container_linux.go:247: starting container process caused \"exec: \\\"/launch.sh\\\": permission denied\"\n"}

After I added RUN chmod +x /launch.sh to the Dockerfile, the pods were created and reached the Running status.
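The "permission denied" error above usually points to a missing execute bit on the entrypoint script. As a hedged sketch, the same condition can be checked on the file before building the image. The helper names `is_executable` and `make_executable` are hypothetical, and the path passed in is whatever local copy of the launch script you have.

```python
import os
import stat

# All three execute bits: user, group and others.
EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def is_executable(path):
    """Return True if the file at `path` has any execute permission bit set."""
    return bool(os.stat(path).st_mode & EXEC_BITS)


def make_executable(path):
    """Set the execute bits for user, group and others (similar to `chmod a+x`)."""
    os.chmod(path, os.stat(path).st_mode | EXEC_BITS)
```

Calling `is_executable` on the local launch script before `docker build` would show whether the file mode is the likely cause, assuming the build copies the file with its mode preserved.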

tianctrl commented 7 years ago

It was my fault: one of my VM hosts was unhealthy, although nothing looked wrong with it. I moved to a bare-metal k8s cluster, and the RabbitMQ cluster worked fine. Thank you for your help, it means a lot to me :)