hashicorp / consul-k8s

First-class support for Consul Service Mesh on Kubernetes
https://www.consul.io/docs/k8s
Mozilla Public License 2.0

consul-sync multi k8s cluster unstable #579

Closed kong62 closed 3 years ago

kong62 commented 3 years ago


Overview of the Issue

Reproduction Steps

Logs

cluster 01:

2021-07-29T03:17:49.396Z [INFO]  to-consul/sink: registering services
2021-07-29T03:18:04.489Z [INFO]  to-consul/sink: registering services
2021-07-29T03:18:04.489Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-1b98f3842a91 service-consul-namespace=""
2021-07-29T03:18:04.496Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-a4e8454056c2 service-consul-namespace=""
2021-07-29T03:18:04.500Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-e5cce37d8007 service-consul-namespace=""
2021-07-29T03:18:19.592Z [INFO]  to-consul/sink: registering services
2021-07-29T03:18:34.664Z [INFO]  to-consul/sink: registering services
2021-07-29T03:18:34.664Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-1b98f3842a91 service-consul-namespace=""
2021-07-29T03:18:34.670Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-a4e8454056c2 service-consul-namespace=""
2021-07-29T03:18:34.675Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-e5cce37d8007 service-consul-namespace=""
2021-07-29T03:18:49.761Z [INFO]  to-consul/sink: registering services
2021-07-29T03:19:04.833Z [INFO]  to-consul/sink: registering services
2021-07-29T03:19:04.833Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-1b98f3842a91 service-consul-namespace=""
2021-07-29T03:19:04.838Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-a4e8454056c2 service-consul-namespace=""
2021-07-29T03:19:04.841Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-e5cce37d8007 service-consul-namespace=""
2021-07-29T03:19:19.924Z [INFO]  to-consul/sink: registering services
2021-07-29T03:19:35.005Z [INFO]  to-consul/sink: registering services
2021-07-29T03:19:35.005Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-1b98f3842a91 service-consul-namespace=""
2021-07-29T03:19:35.011Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-a4e8454056c2 service-consul-namespace=""
2021-07-29T03:19:35.015Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-e5cce37d8007 service-consul-namespace=""
2021-07-29T03:19:50.105Z [INFO]  to-consul/sink: registering services
2021-07-29T03:20:05.180Z [INFO]  to-consul/sink: registering services
2021-07-29T03:20:05.180Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-1b98f3842a91 service-consul-namespace=""
2021-07-29T03:20:05.190Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-a4e8454056c2 service-consul-namespace=""
2021-07-29T03:20:05.194Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster02 service-id=kubernetes-default-e5cce37d8007 service-consul-namespace=""

cluster 02:

2021-07-29T03:19:18.843Z [INFO]  to-consul/sink: registering services
2021-07-29T03:19:18.843Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-198b5a053404 service-consul-namespace=""
2021-07-29T03:19:18.853Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-5c94a96f78a1 service-consul-namespace=""
2021-07-29T03:19:18.861Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-fc3caffd7ddd service-consul-namespace=""
2021-07-29T03:19:33.897Z [INFO]  to-consul/sink: registering services
2021-07-29T03:19:33.897Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-198b5a053404 service-consul-namespace=""
2021-07-29T03:19:33.910Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-5c94a96f78a1 service-consul-namespace=""
2021-07-29T03:19:33.917Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-fc3caffd7ddd service-consul-namespace=""
2021-07-29T03:19:48.955Z [INFO]  to-consul/sink: registering services
2021-07-29T03:19:48.955Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-5c94a96f78a1 service-consul-namespace=""
2021-07-29T03:19:48.967Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-fc3caffd7ddd service-consul-namespace=""
2021-07-29T03:19:48.977Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-198b5a053404 service-consul-namespace=""
2021-07-29T03:20:04.010Z [INFO]  to-consul/sink: registering services
2021-07-29T03:20:04.011Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-5c94a96f78a1 service-consul-namespace=""
2021-07-29T03:20:04.020Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-fc3caffd7ddd service-consul-namespace=""
2021-07-29T03:20:04.027Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-198b5a053404 service-consul-namespace=""
2021-07-29T03:20:19.066Z [INFO]  to-consul/sink: registering services
2021-07-29T03:20:19.067Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-198b5a053404 service-consul-namespace=""
2021-07-29T03:20:19.077Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-5c94a96f78a1 service-consul-namespace=""
2021-07-29T03:20:19.084Z [INFO]  to-consul/sink: deregistering service: node-name=k8s-sync-cluster01 service-id=kubernetes-default-fc3caffd7ddd service-consul-namespace=""

Expected behavior

Environment details

k8s cluster 01 ----> consul-sync 1 ----
                                             |
                                              ----> consul
                                             |
k8s cluster 02 ----> consul-sync 2 ----

cluster01: Consul and consul-sync deployed with Helm

Additional Context

Service instances are very unstable:

image

image

image

This is what I want:

image

ndhanushkodi commented 3 years ago

Hi @kong62, the workflow you describe:

k8s cluster 01 ----> consul-sync 1 ----
                                             |
                                              ----> consul
                                             |
k8s cluster 02 ----> consul-sync 2 ----

is unfortunately not supported at this time. The behaviour you are seeing occurs because each syncer is trying to delete the services that the other syncer has synced. Each syncer keeps a list of the services it has synced and deletes anything it did not sync itself.
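
The flapping described above can be reproduced with a toy model. This is only a sketch under simplifying assumptions, not the consul-k8s implementation: the catalog is a plain set, `sync_pass` is a hypothetical function, and the service names are made up.

```python
# Toy model of two sync processes sharing one Consul catalog.
# Hypothetical names throughout; this is NOT the consul-k8s code.

def sync_pass(catalog, own_services):
    """Register this syncer's services, then delete everything it did not sync."""
    catalog |= own_services            # register our own services
    stale = catalog - own_services     # anything else looks stale to this syncer
    catalog -= stale                   # deregister it
    return stale

catalog = set()
cluster01 = {"svc-a", "svc-b"}   # services known to the syncer in cluster 01
cluster02 = {"svc-c"}            # services known to the syncer in cluster 02

history = []
for _ in range(3):
    removed_by_01 = sync_pass(catalog, cluster01)
    removed_by_02 = sync_pass(catalog, cluster02)
    history.append((sorted(removed_by_01), sorted(removed_by_02)))

for removed_by_01, removed_by_02 in history:
    print("syncer 1 removed", removed_by_01, "| syncer 2 removed", removed_by_02)
```

After the first round, every pass removes the other cluster's services again, which matches the repeating "deregistering service" lines in the logs above.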

If you would like services in multiple clusters synced to Consul, you could consider using Consul service mesh with federation between Kubernetes clusters so that all of the services are registered in Consul.

kong62 commented 3 years ago

@ndhanushkodi solved, thanks

cluster01:

            -consul-node-name=k8s-sync-cluster01 \
            -consul-k8s-tag=cluster01 \

cluster02:

            -consul-node-name=k8s-sync-cluster02 \
            -consul-k8s-tag=cluster02 \
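
A rough way to see why distinct tags help is the toy model below. It assumes that each sync process only cleans up catalog entries carrying its own tag; that is an assumption made for illustration, not something confirmed from the consul-k8s source. The catalog dict, `sync_pass`, and the service names are all hypothetical.

```python
# Toy model: each sync process only reaps catalog entries that carry its own tag.
# ASSUMPTION for illustration; hypothetical names, NOT the consul-k8s code.

def sync_pass(catalog, own_services, tag):
    """Register own services under `tag`, then reap stale entries with that tag only."""
    for service in own_services:
        catalog[service] = tag
    stale = [s for s, t in catalog.items() if t == tag and s not in own_services]
    for service in stale:
        del catalog[service]
    return stale

catalog = {}   # service name -> tag of the syncer that registered it
removed = []
for _ in range(3):
    removed += sync_pass(catalog, {"svc-a", "svc-b"}, "cluster01")
    removed += sync_pass(catalog, {"svc-c"}, "cluster02")

print("removed:", removed)
print("catalog:", catalog)
```

In this model neither syncer ever touches the other's entries, so the catalog stays stable across passes.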

image

bondido commented 2 years ago

Hi @ndhanushkodi, is the approach described by @kong62, differentiating k8s clusters by -consul-k8s-tag, a supported way of syncing services from multiple k8s clusters to a single Consul datacenter? Can we use it and be sure it won't stop working (by being "fixed" as a bug) in some future version?

If solving such cases with appropriate tagging is not just taking advantage of a "bug", could it be described in the docs?

lkysow commented 2 years ago

Hi @bondido, yes, that's the expected way. It won't be removed later. I can open up a ticket to document this better, but if you'd like to submit a documentation PR yourself, the page is here: https://github.com/hashicorp/consul/blob/main/website/content/docs/k8s/service-sync.mdx