rabbitmq / rabbitmq-peer-discovery-k8s

Kubernetes-based peer discovery mechanism for RabbitMQ

Peer discovery should not cross namespace boundary #34

Closed 007 closed 6 years ago

007 commented 6 years ago

I have multiple rabbitmq clusters deployed in a single k8s cluster using the k8s discovery backend, and they all end up trying to peer with each other.

Since each cluster has a separate Erlang cookie, they are unsuccessful in replicating, but the logs are littered with messages like:

2018-08-20 22:02:38.901 [error] <0.6449.0> ** Connection attempt from disallowed node 'rabbit@100.98.184.20' ** 

The k8s peering backend should limit to $self.namespace by default and allow an override, or should have an option to limit peering to its own namespace.

To repro, you should be able to build a cluster with the reference StatefulSet config, then change the namespace and Erlang cookie to deploy a second copy. Each cluster will discover the other's peers, and both will continuously log "disallowed node" messages.

michaelklishin commented 6 years ago

I believe this is already possible and exposed in the config schema.

michaelklishin commented 6 years ago

I'm not sure why the namespace was chosen to be stored in a file, since in theory it is not a sensitive value (unlike, say, the token). That was inherited from rabbitmq-autocluster.

007 commented 6 years ago

You get /var/run/secrets/kubernetes.io/serviceaccount as a metadata file mount for free when you have a service account token mounted to the pod. It includes at least token, namespace and ca.crt, so it's an easy way for any container within a pod to find $self.namespace without resorting to API acrobatics.
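
For illustration, here is a minimal Python sketch of reading $self.namespace from that mount. The directory and the file name `namespace` come from the comment above; the function name `read_own_namespace` is hypothetical and is not part of the plugin.

```python
from pathlib import Path

# Mount point described above; present when a service account token is mounted.
SERVICE_ACCOUNT_DIR = Path("/var/run/secrets/kubernetes.io/serviceaccount")


def read_own_namespace(sa_dir=SERVICE_ACCOUNT_DIR):
    """Return the pod's own namespace, or None if the file cannot be read.

    Hypothetical helper for illustration only.
    """
    ns_file = Path(sa_dir) / "namespace"
    try:
        # The file holds just the namespace name; strip any trailing newline.
        return ns_file.read_text(encoding="utf-8").strip()
    except OSError:
        # Not running in a pod, or the service account mount is disabled.
        return None
```

Outside a pod the call simply returns None, so a caller can fall back to an explicitly configured namespace.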

Erlang hurts my head to parse, but from what I can understand it looks like it's fetching /api/v1/namespaces/${namespace}/endpoints/${servicename} then mapping X = .subsets[].addresses[] and extracting X.${k8s_address_type}. Also: wow, the entire plugin is like 357 lines in 3 files. 👍 👍
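
As a rough Python rendering of that reading of the plugin (not the plugin's actual Erlang code), the lookup could be sketched like this. The URL shape and the `.subsets[].addresses[]` traversal are taken from the comment above; the function names, the sample payload, and the assumption that the address type is either `ip` or `hostname` are illustrative.

```python
import json


def endpoints_path(namespace, service_name):
    """Build the namespaced Endpoints path described above."""
    return f"/api/v1/namespaces/{namespace}/endpoints/{service_name}"


def extract_peers(endpoints, address_type="ip"):
    """Collect X[address_type] for every X in .subsets[].addresses[]."""
    peers = []
    # "subsets" and "addresses" may be missing or null when no pods are ready.
    for subset in endpoints.get("subsets") or []:
        for address in subset.get("addresses") or []:
            value = address.get(address_type)
            if value:
                peers.append(value)
    return peers


# Made-up sample response for illustration; not output from a real cluster.
sample = json.loads("""
{
  "subsets": [
    {
      "addresses": [
        {"ip": "10.0.0.1", "hostname": "rabbitmq-0"},
        {"ip": "10.0.0.2", "hostname": "rabbitmq-1"}
      ]
    }
  ]
}
""")

if __name__ == "__main__":
    print(endpoints_path("rabbit-a", "rabbitmq"))
    print(extract_peers(sample))
```

Because the namespace is part of the request path, a lookup built this way can only return addresses from that one namespace, which is why the cross-namespace log entries are surprising.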

So how is it possible that it ever "crossed the streams" and got another namespace's nodes? Yes, it shouldn't be possible, but I have log messages (and thankfully mismatched cookies) that say it happened.

Is there any race on startup that would use another discovery mechanism before loading the configs and trying k8s?