aws / aws-app-mesh-controller-for-k8s

A controller to help manage App Mesh resources for a Kubernetes cluster.
Apache License 2.0

Cloud Map instances health UNKNOWN during service discovery #132

Closed: rohithcj closed this issue 4 years ago

rohithcj commented 4 years ago

When using Cloud Map service discovery with the App Mesh controller, instances are registered with health status "UNKNOWN" even though a health check is configured in the Virtual Node listener:

Virtual node configuration:

apiVersion: appmesh.k8s.aws/v1beta1
kind: VirtualNode
metadata:
  name: nodejs-app
  namespace: appmesh-workshop-ns
spec:
  meshName: appmesh-workshop
  listeners:
    - portMapping:
        port: 3000
        protocol: "http"
      healthCheck:
        healthyThreshold: 3
        intervalMillis: 10000
        path: "/"
        port: 3000
        protocol: http
        timeoutMillis: 5000
        unhealthyThreshold: 3

  serviceDiscovery:
    cloudMap:
      namespaceName: appmeshworkshop.pvt.local
      serviceName: nodejs

Health Status in Cloud Map:

$ aws servicediscovery discover-instances --namespace-name appmeshworkshop.pvt.local --service-name nodejs | jq '.Instances[].HealthStatus'
"UNKNOWN"
"UNKNOWN"
"UNKNOWN"
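
For a service with many instances, a per-status count is easier to read than the raw list. The sketch below is one way to do that, not something from the controller itself: `tally_health` and `cloudmap_health_summary` are hypothetical helper names, and the wrapper assumes the AWS CLI and jq are installed and credentials are configured.

```shell
# Tally health statuses read from stdin (one status per line)
# and print them as STATUS=count, sorted by status name.
tally_health() {
  sort | uniq -c | awk '{ printf "%s=%d\n", $2, $1 }'
}

# Hypothetical wrapper: query Cloud Map and summarize instance health.
# $1 = namespace name, $2 = service name.
cloudmap_health_summary() {
  aws servicediscovery discover-instances \
    --namespace-name "$1" --service-name "$2" \
    | jq -r '.Instances[].HealthStatus' \
    | tally_health
}

# Usage (values taken from the report above):
#   cloudmap_health_summary appmeshworkshop.pvt.local nodejs
```

For the three instances shown above, the summary would be a single line, `UNKNOWN=3`.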

astrived commented 4 years ago

@achevuru will work with @kiranmeduri

M00nF1sh commented 4 years ago

This is already addressed in our GA version. You can optionally pass the --enable-custom-health-check=true flag to the controller. In that case, it will create the cloudMapService with healthCheck enabled.

Note: a cloudMapService with healthCheck enabled incurs additional cost and is unnecessary overhead, so only enable it if the UNKNOWN status is a concern for you.
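
As a rough sketch of how the flag could be added to a running controller, the snippet below appends it to the container arguments with a JSON Patch. Only the flag itself comes from this thread. The helper names are hypothetical, and the namespace `appmesh-system`, the deployment name `appmesh-controller`, and the assumption that the controller is the first container with an existing `args` list should all be checked against your installation.

```shell
# Build a JSON Patch that appends one argument to the first container's args.
# Assumes the container already defines an args list.
build_arg_patch() {
  printf '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"%s"}]' "$1"
}

# Hypothetical helper: the default namespace and deployment name are assumptions.
# $1 = namespace (optional), $2 = deployment name (optional).
enable_custom_health_check() {
  ns="${1:-appmesh-system}"
  deploy="${2:-appmesh-controller}"
  kubectl -n "$ns" patch deployment "$deploy" --type=json \
    -p "$(build_arg_patch '--enable-custom-health-check=true')"
}

# Usage:
#   enable_custom_health_check
#   enable_custom_health_check my-namespace my-controller
```

If the controller was installed with Helm or another deployment tool, a manual patch like this is likely to be overwritten on the next upgrade, so setting the flag through that tool's configuration would be the more durable choice.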

mindbergh commented 2 years ago

@M00nF1sh would you please clarify why enabling the health check is unnecessary overhead?

Did you mean that, although the status is UNKNOWN, the healthCheck spec under the listeners is still used for registering/deregistering the instances in Cloud Map? Or did you mean that the healthCheck spec under the listeners is completely unnecessary?

pygaissert commented 2 years ago

@M00nF1sh echoing @mindbergh's question. My Virtual Gateway is failing to route requests to any of my Virtual Services, and I would like to determine whether the "UNKNOWN" health status is contributing to this.