hashicorp / consul-k8s

First-class support for Consul Service Mesh on Kubernetes
https://www.consul.io/docs/k8s
Mozilla Public License 2.0

Sync-Catalog: Configure health-check #29

Closed sagarj8n closed 1 year ago

sagarj8n commented 5 years ago

How do you configure the health check for a service that is automatically registered using Consul catalog sync?

madsonic commented 5 years ago

Strangely, even the demo shows a dash for consul-ui: https://www.hashicorp.com/blog/consul-and-kubernetes-service-catalog-sync. Some of my services automatically show a passing health check while others show merely a dash (as in the demo). Is there anything in particular to do to show health checks for synced services in Kubernetes?

arnodenuijl commented 5 years ago

Just to chime in: I have a case where I need a health check here too.

We are running a setup with Consul and Fabio (https://fabiolb.net/). Fabio is a reverse proxy that automatically registers Consul services as routes (via Consul tags).

I have consul-k8s set up via Helm, and when I deploy a service in Kubernetes I see it appear in Consul, including the 'urlprefix-' tag that I added (which Fabio uses). The only problem is that Fabio only registers services that have a health check, so since consul-k8s doesn't add an explicit health check, the service doesn't appear in the reverse proxy.

---
apiVersion: v1
kind: Service
metadata:
  name: mynginx
  annotations:
    consul.hashicorp.com/service-tags: "urlprefix-/mynginx"
    consul.hashicorp.com/service-sync: "true"
    consul.hashicorp.com/service-name: "mynginx"
  labels:
    run: mynginx
spec:
  ports:
  - port: 80
    protocol: TCP
  selector:
    run: mynginx

tasdikrahman commented 5 years ago

@sagarj8n @madsonic @arnodenuijl did you folks figure it out? Landed here on a google search for an answer for the same.

arnodenuijl commented 5 years ago

No. I must admit I haven't spent much more time on the issue, but I didn't see a way to get this working.

jensoncs commented 5 years ago

@adilyse @mitchellh Can you confirm whether Consul sync catalog is providing any health check or not? If not is there any plan to add the functionality?

yarinm commented 5 years ago

+1 for this. I also encountered a scenario where I'd like to have a health check defined in Consul. I might have to resort to manually syncing the service and defining the health checks, but this should be solved in catalog sync.

dfang commented 5 years ago

K8s has health checks; maybe catalog sync should sync status from those.

gmachine1 commented 5 years ago

I'm looking for this too. It looks like it's not there, so instead I will manually add the desired health checks.

gmachine1 commented 5 years ago

In particular, what would be ideal in my case is to add some annotations such as

"consul.hashicorp.com/service-meta-check_tcp": "true"
"consul.hashicorp.com/service-meta-check_interval": "5s"

(as was done when using registrator and docker-compose) and have a TCP check added automatically.
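To make the proposal concrete, a sketch of what such a Service might look like; note that the `check_*` annotation keys below are only this thread's proposal, not an implemented consul-k8s feature:

```yaml
# Hypothetical sketch: the check_* annotations are only the
# proposal above, not an implemented consul-k8s feature.
apiVersion: v1
kind: Service
metadata:
  name: mynginx
  annotations:
    consul.hashicorp.com/service-sync: "true"
    # proposed: the sync process would turn these into a TCP check
    consul.hashicorp.com/service-meta-check_tcp: "true"
    consul.hashicorp.com/service-meta-check_interval: "5s"
spec:
  ports:
  - port: 80
    protocol: TCP
  selector:
    run: mynginx
```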

mstrYoda commented 4 years ago

Any update here? We are now using Consul catalog sync to sync services from k8s to Consul, and we want to add health checks in Consul for those services. I'd like to add a feature that adds health checks via custom annotations, as @gmachine1 said. It seems it would be enough to check a TCP connection to :.

Any thoughts @lkysow ?

KalenWessel commented 4 years ago

Hoping for an update on this? I'm starting to use connect-inject which does a lot of things behind the scenes via annotations but still doesn't support service health checks which seems pretty important.

mstrYoda commented 4 years ago

I volunteer to make a PR for this. We are using Consul catalog sync, and we currently need to add another health check layer to our application; instead, catalog sync could handle this.

We are registering services from k8s and VMs into Consul. In some cases, we need to health-check those IP addresses. I will update my comment when I remember the exact case.

david-yu commented 4 years ago

Hi there, Consul PM here. It sounds like there are two distinct ideas on how one would want to consume Healthchecks. One is to configure an actual Consul Health check on Kubernetes, while another is to sync a Kubernetes health check with Consul to help direct traffic to healthy service instances.

We are currently leaning towards the latter, since it aligns better with Kubernetes design and would not require explicit annotations to define any health checks. For those who asked for native Consul health checks, I'm curious whether defining those health checks as Kubernetes readiness probes would be a workable solution, which Consul on Kubernetes could then pick up to direct traffic to healthy service instances.
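For reference, the kind of Kubernetes readiness probe being discussed might look like this in a pod spec (the `/health` path and the timings are illustrative assumptions, not anything consul-k8s prescribes):

```yaml
# Illustrative readiness probe on a container; the /health
# endpoint and timings are assumptions for the example.
apiVersion: v1
kind: Pod
metadata:
  name: mynginx
spec:
  containers:
  - name: mynginx
    image: nginx
    readinessProbe:
      httpGet:
        path: /health
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
```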

filintod commented 4 years ago

@david-yu I also want the latter (sync kubernetes health checks with Consul).

Thanks

JasonGilholme commented 4 years ago

Yes, the latter please!

raypettersen commented 3 years ago

@david-yu what's the status on this feature?

phr0zenx commented 3 years ago

@david-yu Is there a way I can implement health checks via k8s to check on the services through Consul? I want to implement pod-regenerative triggers that k8s is able to execute, unless Consul can do that?

lkysow commented 3 years ago

Hi Folks, I'm curious what types of services you're syncing to Consul.

NodePort and LoadBalancer services don't necessarily make sense to have their health synced to Consul because when the request lands at the node port or load balancer, Kubernetes handles routing to only healthy pods.

If you're syncing ClusterIP services, then the individual pod ips are synced to Consul and in this case it does make sense to sync the readiness status of each pod.
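To make the distinction concrete, a default (ClusterIP) Service like the sketch below is the case where individual pod IPs are synced and per-pod readiness would matter; the `myapp` names are placeholders:

```yaml
# Default Service type is ClusterIP: per the comment above,
# catalog sync registers the individual pod (endpoint) IPs.
# For NodePort/LoadBalancer, Kubernetes itself routes to healthy pods.
apiVersion: v1
kind: Service
metadata:
  name: myapp
  annotations:
    consul.hashicorp.com/service-sync: "true"
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    app: myapp
```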

ak08743 commented 3 years ago

Hi @david-yu

Are there any updates on this issue ?

ak08743 commented 3 years ago

@ndhanushkodi Thank you for a quick reply.

The document that you linked is about service mesh and Consul cluster installed in k8s environment, unless I am misunderstanding something. This issue, and the problem we are experiencing now, is that we have external Consul cluster, and use ServiceSync to expose selected k8s services into that external Consul cluster for 3rd-party consumption. Reading through the linked document, I do not see how it would help in this case.

david-yu commented 3 years ago

Hi @ak08743 and folks following along. This issue is still outstanding. We initially had thought we could address this as part of a larger effort to build in health check support for Consul on Kubernetes, however we had to move this to a future release due to scope.

As pointed out by @lkysow, could folks chime in on what kind of Services you are syncing with Consul K8s, and how you expect the health to be represented if one of multiple pods for a service goes down? As an example, in the case of LoadBalancer and NodePort services, if one of the pods is unhealthy the service may still be considered healthy, since K8s would automatically route traffic to healthy pods.

ruaridhangus-fbr commented 3 years ago

In our case we have an inter-cluster setup between EKS and Bare Metal with two different types of service; however, one type is being removed as it is legacy.

1) EKS service is ClusterIP synced to Consul (the AWS ENI IP address of the pod, to be clear), which is then accessible to the Bare Metal cluster.
2) NodePort service synced to Consul with a hardcoded port (this is a legacy hangover for us and is being deprecated, so it is not the end of the world if the pod is not health-checked).

As for what is healthy for us: if there is one or more healthy pod in the service, then the service can be classed as healthy, but an unhealthy individual pod shouldn't be included in the DNS lookup.

ak08743 commented 3 years ago

In our use case, we expose only LoadBalancer services outside the k8s cluster, so as long as there is something alive and healthy that can process the request, the service as a whole should be marked as healthy.

webmutation commented 2 years ago

Just came across this issue, this is a very significant problem for what we want to achieve.

I have the same use case as @arnodenuijl: FabioLB is used as the reverse proxy to our services. Our use case is somewhat complex; we currently have a manual reverse-proxy mapping tool that we are trying to replace with Consul+FabioLB.

We can sync the service, but Fabio does not create a route if there is no health check. We have a mix of VM and K8s services; the VM services all register properly and can be exposed, while the K8s services using the Consul sync mechanism fail to create a route due to this constraint in Fabio. We are currently looking for workarounds.

We have the same acceptance parameter, if the service has one healthy pod behind it then it is healthy.

All our services have Liveness, Readiness and Startup probes set up. I also noticed that when the number of instances is zero, the service is removed from Consul (this is a desirable effect).

webmutation commented 2 years ago

@david-yu @lkysow could you provide some input on this topic? Sorry for the long post, but this issue is critical for us.

A node health check for the consul-sync-catalog service does not seem to be a way to bypass the health check issue, since there is no actual agent and therefore no agent ID.

{
   "Node":{
      "ID":"",
      "Node":"k8s-sync",
      "Address":"127.0.0.1",
      "Datacenter":"dc1",
      "TaggedAddresses":null,
      "Meta":{
         "external-source":"kubernetes"
      },
      "CreateIndex":63620,
      "ModifyIndex":63620
   },
   "Services":{
      "another-nginx-service-nodeport-dev-ee37527b1e04":{
         "ID":"another-nginx-service-nodeport-dev-ee37527b1e04",
         "Service":"another-nginx-service-nodeport-dev",
         "Tags":[
            "k8s",
            "nginx",
            "dev",
            "urlprefix-/dev"
         ],
         "Address":"1.1.1.1",
         "Meta":{
            "external-k8s-ns":"dev",
            "external-source":"kubernetes",
            "port-":"80"
         },
         "Port":30877,
         "Weights":{
            "Passing":1,
            "Warning":1
         },
         "EnableTagOverride":false,
         "Proxy":{
            "Mode":"",
            "MeshGateway":{

            },
            "Expose":{

            }
         },
         "Connect":{

         },
         "CreateIndex":101871,
         "ModifyIndex":101871
      }
   }
}

A service health check using a check-definition.json in the Consul config directory did not work either; it returns

Error reloading: Unexpected response code: 500 (Failed reloading checks: Failed to register check 'Check K8S Health': ServiceID "another-nginx-service-nodeport-cc-dev" does not exist &{K8S_Check Check K8S Health another-nginx-service-nodeport-dev [] http://1.1.1.1/ map[] GET 30s false false 1s 0s 0 0 0s 4096 {}})

The same result happens trying to use the REST API.

Reading the documentation this appears to be because the K8S Synced services are not part of any real agent, they are considered as external services. This checks out, since using the API to call /v1/agent/services does not return any of the Kubernetes services... these are only accessible via /v1/catalog/services.

After registering an external service with a node health check, I conclude that the current registration mechanism for consul-sync-catalog does not follow the same rules, since no node check is performed when registering this special external k8s-sync node.

In the documentation, I was able to find this information regarding external nodes:

Because the check is listed we know that there is an entry for it in the catalog, but because this node and service don't have a local agent to run the check the node's health will not be monitored.

Could something similar be applied to the consul-sync-catalog service registration process, so it works in the same way as an external service with health checks? Then we could at least have a node-level check passing and allow load balancers like FabioLB to create routes to the services.

Since in k8s the service disappears from the Consul catalog when it has no healthy pod behind it, using Consul ESM would not initially be necessary; just the ability to register a check for the k8s-sync node would suffice. Although it may be interesting in the future if ESM were deployed as a sidecar to consul-sync-catalog and able to run HTTP and TCP health checks.

webmutation commented 2 years ago

Just adding to the mix that a different Consul sync project has added support for health checks: https://github.com/lynes-io/kube-consul-register. In this case they are using annotations:

Annotation | Value | Description
-- | -- | --
consul.register/pod.container.probe.liveness | true\|false | Use container Liveness probe for checks. Default is true.
consul.register/pod.container.probe.readiness | true\|false | Use container Readiness probe for checks. Default is false.
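Applied to a pod, those annotations would look roughly like the sketch below (kube-consul-register is a separate project, not consul-k8s, and the pod/container names here are placeholders):

```yaml
# Sketch of kube-consul-register annotations (a separate project,
# not consul-k8s), based on the annotation keys quoted above.
apiVersion: v1
kind: Pod
metadata:
  name: myapp
  annotations:
    consul.register/pod.container.probe.liveness: "true"   # default is true
    consul.register/pod.container.probe.readiness: "true"  # default is false
spec:
  containers:
  - name: myapp
    image: nginx
```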

To have a working Consul+FabioLB setup, it would be enough if the virtual node k8s-sync had a health check; then we could do routing based on that value. Only one passing health check is needed, node or service, not both.

david-yu commented 2 years ago

Hi @webmutation, sorry for the delay here. We are open to PRs to address this and can review changes to catalog sync. Although we still maintain catalog sync, we have arrived at a state where we don't anticipate any more changes to it, since our focus has shifted to building a service mesh product that supersedes the original work built into catalog sync, by providing service discovery across both K8s and VM clusters through the mesh. We do support health checks through the service mesh, by the way. This may not be what you were hoping for, but I wanted to communicate our current stance and thinking on this issue.

Juandavi1 commented 2 years ago

In short, you don't recommend using this tool? Health management is critical, and we can't use Consul as a service mesh because we already use Istio.

david-yu commented 1 year ago

Closing out as this is addressed by https://github.com/hashicorp/consul-k8s/pull/1821. This will be released with Consul K8s 1.0.x and Consul K8s 1.1.x.