Closed — @webmutation closed this issue 2 years ago.
Tried a workaround: changing the nodeName on the second cluster, but the behaviour is still the same. Also, since there is no health check on the node, the nodes become orphaned (k8s-sync-A).
@webmutation - the working way of handling your scenario is to differentiate the services from each Kubernetes cluster by tag in the Consul catalog. It's described here - https://github.com/hashicorp/consul-k8s/issues/579 - and confirmed there as the expected method.
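For future readers, a sketch of what that per-cluster differentiation can look like in Helm values. The value name `k8sTag` and the cluster-specific tag strings are assumptions based on chart versions I've seen — verify both against your chart's `values.yaml` before relying on them:

```yaml
# Cluster A values (sketch -- value name and tags are assumptions):
syncCatalog:
  enabled: true
  # A tag unique to this cluster, so this sync process only
  # manages the catalog entries it registered itself.
  k8sTag: "k8s-cluster-a"

# Cluster B would use the same values with a different tag,
# e.g. k8sTag: "k8s-cluster-b".
```

With distinct tags, each sync instance ignores services registered by the other cluster instead of treating them as stale and deregistering them.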
Thank you @bondido — tested, and indeed it is working! Services now stay registered.
However, I wonder whether there is a setting to remove orphaned nodes after a timeout period. In other words, how can we avoid having to remove the nodes manually? Is this possible? I was not able to find anything in the chart.
Hey @webmutation!! Consul does have a default setting for removing orphaned nodes, which is currently in the range of days. We do not expose this via the Helm chart, and I don't think we intend to at the moment, unfortunately. We don't see this as a scenario users are expected to run into in a stable deployment.
Thanks for the message @thisisnotashwin — it is clear now.
In our case, we have on-demand clusters that live only for a few hours or days, for UAT, training events, or integration testing (specific versions of components being deployed). I think we will have to write a script that removes the node manually once the cluster is destroyed. It should not be a huge issue to handle. Thanks.
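Such a script can stay small: Consul's catalog HTTP API has a deregister endpoint that removes a node and every service registered under it in one call. A minimal sketch, assuming the orphaned node is named `k8s-sync-A` and that `CONSUL_HTTP_ADDR` points at a reachable Consul server; it prints the request as a dry run, so drop the leading `echo` to actually send it:

```shell
#!/bin/sh
# Sketch: deregister an orphaned catalog-sync node after its cluster
# is destroyed. The node name and address are assumptions -- substitute
# your own values.
CONSUL_HTTP_ADDR="${CONSUL_HTTP_ADDR:-http://127.0.0.1:8500}"
NODE_NAME="${1:-k8s-sync-A}"

# The /v1/catalog/deregister endpoint removes the node and all
# services registered under it in a single PUT.
deregister_payload() {
  printf '{"Node": "%s"}' "$NODE_NAME"
}

# Dry run by default: print the command instead of executing it.
echo curl -s -X PUT \
  -d "$(deregister_payload)" \
  "${CONSUL_HTTP_ADDR}/v1/catalog/deregister"
```

If ACLs are enabled, the request also needs a token with `node:write` permissions for the sync node.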
Overview of the Issue
If two or more instances of consul-sync are running and pointing at the same external Consul service, all the services get unregistered, and sync goes into an endless loop of unregistering and re-registering services.
Reproduction Steps
```yaml
global:
  enabled: false

client:
  enabled: false

externalServers:
  enabled: true
  hosts:

syncCatalog:
  enabled: true
  k8sDenyNamespaces: ["kube-system", "kube-public"]
```
Then the unregistering of the services starts to occur: some services show up, then all services show up, then all services disappear, and this loop goes on forever.
Expected behavior
Services should not disappear; an additional cluster connecting to Consul should simply have its services registered, without existing services being unregistered. This is probably because the special k8s-sync node is being deleted and recreated...
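One way to observe the two sync processes fighting over the shared node is to poll the catalog-node endpoint, which lists every service registered under a given node. A sketch, assuming the default sync node name `k8s-sync` and a local Consul address; it prints the command as a dry run:

```shell
#!/bin/sh
# Address and node name are assumptions -- adjust for your deployment.
CONSUL_HTTP_ADDR="${CONSUL_HTTP_ADDR:-http://127.0.0.1:8500}"

# Build the catalog-node URL; /v1/catalog/node/:node returns all
# services registered under that node.
node_url() {
  printf '%s/v1/catalog/node/%s' "$CONSUL_HTTP_ADDR" "k8s-sync"
}

# Dry run: print the command. With two sync instances attached,
# repeated calls show the service list churning between polls.
echo curl -s "$(node_url)"
```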