Open tjhiggins opened 3 years ago
Hi @tjhiggins, what versions of Consul and consul-k8s are you using?
The latest version of consul-k8s (0.24.0) contains a new cleanup controller which I believe may help address this issue.
Connect: add new cleanup controller that runs in the connect-inject deployment. This controller cleans up Consul service instances that remain registered despite their pods being deleted. This could happen if the pod's preStop hook failed to execute for some reason. [GH-433]
@blake Thanks for the quick response. I saw that - which is awesome, but I feel like this should be core functionality for non-k8s use-cases.
Unfortunately, we cannot use the connect-inject controller because we need to support exposing multiple ports, so we have custom Terraform that creates multiple Envoy proxies and related resources. We are also planning to do something similar for ECS, where we wouldn't have access to a cleanup controller.
Edit: My workaround at the moment is to attach the "Proxy Public Listener" check to both services, but that isn't ideal.
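A minimal sketch of that workaround, assuming the service is registered through the agent HTTP API (PUT /v1/agent/service/register). The helper name, service IDs, addresses, ports, and timeout values are hypothetical; only the payload field names (ID, Name, Port, Checks, TCP, Interval, DeregisterCriticalServiceAfter) come from the Consul agent API. The idea is to give the parent service the same TCP check as the sidecar, so both turn critical and are reaped together when the proxy listener disappears.

```python
import json


def service_with_proxy_listener_check(service_id, name, port,
                                      proxy_host, proxy_port,
                                      deregister_after="1m"):
    """Build a service registration payload whose health depends on the
    sidecar's public listener (hypothetical helper, not part of Consul)."""
    return {
        "ID": service_id,
        "Name": name,
        "Port": port,
        "Checks": [
            {
                # Same TCP probe the sidecar uses for its own listener.
                "Name": "Proxy Public Listener",
                "TCP": f"{proxy_host}:{proxy_port}",
                "Interval": "10s",
                # Once this check stays critical for this long, the agent
                # deregisters the parent service as well.
                "DeregisterCriticalServiceAfter": deregister_after,
            }
        ],
    }


if __name__ == "__main__":
    # Example values only; adjust to the real service and proxy ports.
    payload = service_with_proxy_listener_check(
        "web-1", "web", 8080, "127.0.0.1", 21000)
    print(json.dumps(payload, indent=2))
```

The drawback mentioned above still applies: the parent service's health is now tied to the proxy listener, which is why this is a workaround rather than a fix.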
Overview of the Issue
We are running Consul in a k8s cluster. Sometimes the sidecar deregistration fails to run when a pod gets deleted. The sidecar is removed after the deregister_critical_service_after timeout, but the original service remains registered.
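To make the leftover state concrete, here is a rough sketch that spots services with no sidecar pointing at them. It assumes the input has the shape of the agent's /v1/agent/services response (a map of service ID to service definition, where proxies carry Kind "connect-proxy" and Proxy.DestinationServiceID), and it assumes every regular service on the node is expected to have a sidecar. The function name and the sample IDs are hypothetical.

```python
def services_without_sidecar(services):
    """Return the IDs of regular services that no connect-proxy targets.

    `services` is assumed to look like the JSON body returned by
    GET /v1/agent/services, already decoded into a dict.
    """
    # Collect the parent service IDs that still have a registered proxy.
    proxied = {
        (svc.get("Proxy") or {}).get("DestinationServiceID")
        for svc in services.values()
        if svc.get("Kind") == "connect-proxy"
    }
    # Regular services have an empty (or missing) Kind; gateways and
    # proxies are skipped.
    return sorted(
        sid for sid, svc in services.items()
        if not svc.get("Kind") and sid not in proxied
    )


if __name__ == "__main__":
    snapshot = {
        "web-1": {"ID": "web-1", "Service": "web", "Kind": ""},
        "api-1": {"ID": "api-1", "Service": "api", "Kind": ""},
        "api-1-sidecar-proxy": {
            "ID": "api-1-sidecar-proxy",
            "Service": "api-sidecar-proxy",
            "Kind": "connect-proxy",
            "Proxy": {"DestinationServiceID": "api-1"},
        },
    }
    # web-1 has lost its sidecar; api-1 still has one.
    print(services_without_sidecar(snapshot))
```

In the situation described above, the original service would show up in this list once its sidecar has been reaped by the timeout.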
Reproduction Steps
Steps to reproduce this issue, eg:
Proposal
Option 1: Remove the service when its sidecar is deregistered
Option 2: Allow for an alias_service check to the service sidecar
I tried the following (Proxy Alias Check), but it says the service sidecar does not exist on the node: