emissary-ingress / emissary

open source Kubernetes-native API gateway for microservices built on the Envoy Proxy
https://www.getambassador.io
Apache License 2.0

problem when ambassador pods are created earlier than emissary-apiext pods #4164

Open wasparch opened 2 years ago

wasparch commented 2 years ago

The problem occurs when the K8s worker nodes running the emissary-apiext pods and the ambassador pods are restarted (for example after a crash). Once the worker nodes are back up, all deployments start at nearly the same time, and the K8s scheduler and controllers do not guarantee any ordering: sometimes the ambassador deployment starts first, sometimes emissary-apiext. When ambassador starts before emissary-apiext, the FilterPolicy, Filter, and Module definitions we use in our project are not applied properly in ambassador, and as a result our application does not work. In the FilterPolicy, Filter, and Module definitions we use apiVersion: getambassador.io/v2 (I also tried v3alpha1, but the problem persists).

Steps to reproduce the behavior:

  1. Scale the K8s worker nodes to 0 (or terminate only the nodes where the emissary-apiext and ambassador pods are running)
  2. Scale the K8s worker nodes back up
  3. The ambassador pods start before the emissary-apiext pods
  4. The FilterPolicy, Filter, and Module definitions are not used by ambassador

Or, the easiest way to reproduce it (I know that in real life emissary-apiext should not be scaled to 0); a kubectl sketch of these steps follows the list:

  1. Scale the emissary-apiext deployment to 0
  2. Restart the ambassador deployment
  3. Scale the emissary-apiext deployment back up
  4. The FilterPolicy, Filter, and Module definitions are not used by ambassador
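
For completeness, the quick reproduction above corresponds to roughly the following commands. This is only a sketch: the deployment names and the emissary-system namespace are assumptions based on a default install and may differ in your cluster.

    # Names and namespace are assumptions; adjust to your install.
    # 1. Take the conversion webhook offline
    kubectl -n emissary-system scale deployment emissary-apiext --replicas=0

    # 2. Restart ambassador/emissary while the webhook is unavailable
    kubectl -n emissary-system rollout restart deployment emissary-ingress

    # 3. Bring the conversion webhook back
    kubectl -n emissary-system scale deployment emissary-apiext --replicas=3

    # 4. The getambassador.io resources (FilterPolicy, Filter, Module) are
    #    now not applied by the restarted ambassador pods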

Expected behavior: FilterPolicy, Filter, and Module definitions are always applied properly by ambassador.


alexgervais commented 2 years ago

@wasparch indeed, this is a known limitation of the way Custom Resource Definition versioning works with Kubernetes and the API server. In order to work, the emissary pods need to query the v3alpha1 resources, and the conversion to that version is handled by emissary-apiext. No emissary-apiext, no custom resources. In the default setup there should be at least 3 replicas of emissary-apiext to ensure uptime.
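
One way to see this dependency in a running cluster is to look at the conversion webhook that emissary-apiext registers on the getambassador.io CRDs; a rough sketch (the CRD, service, and namespace names below assume a default install):

    # Show the conversion webhook clientConfig on one of the getambassador.io CRDs
    kubectl get crd hosts.getambassador.io \
      -o jsonpath='{.spec.conversion.webhook.clientConfig}{"\n"}'

    # If the service behind that clientConfig (typically emissary-apiext in the
    # emissary-system namespace) has no ready endpoints, the API server cannot
    # convert between CRD versions, and emissary cannot read its resources.
    kubectl -n emissary-system get endpoints emissary-apiext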

How many nodes are running in your cluster, and how do you typically end up in this situation?

wasparch commented 2 years ago

@alexgervais In our EKS cluster we have two to three worker nodes, depending on demand. We typically end up in this situation when we switch off our worker nodes for the night (cost optimization); the next day, after all services have started, we hit this problem. It also occurs when our worker nodes restart because of some failure. In our emissary-apiext deployment the number of replicas is set to 3.

Additionally, what is the best liveness probe setting for emissary-apiext? I also see the pods restarting many times because of: "Liveness probe failed: Get "http://100.64.74.123:8080/probes/live": dial tcp 100.64.74.123:8080: connect: connection refused". The liveness probe configuration in our deployment is:

    livenessProbe:
      httpGet:
        path: /probes/live
        port: 8080
        scheme: HTTP
      initialDelaySeconds: 5
      timeoutSeconds: 1
      periodSeconds: 3
      successThreshold: 1
      failureThreshold: 3
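
Would something along these lines be more appropriate? The numbers are just my guess, not values I found documented anywhere: more time before the first probe and a higher failure threshold should reduce restarts while the webhook is still starting up, and a readiness probe keeps not-yet-ready pods out of the Service endpoints.

    # Illustrative values only; the readiness path is assumed from the default
    # manifest and should be verified against your install.
    livenessProbe:
      httpGet:
        path: /probes/live
        port: 8080
        scheme: HTTP
      initialDelaySeconds: 30
      timeoutSeconds: 3
      periodSeconds: 10
      failureThreshold: 5
    readinessProbe:
      httpGet:
        path: /probes/ready
        port: 8080
        scheme: HTTP
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3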

One more question: is there a strict requirement that the emissary pods and the emissary-apiext pods run on the same worker nodes? I also had a situation where the apiext pods were scheduled on one worker node and the emissary pods on another, and even restarting the emissary pods did not help.

I am also thinking that the emissary pods could include an init container that checks whether the conversion webhook provided by emissary-apiext is available, and only lets the emissary pods start once it is; a rough sketch of what I mean is below.
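
Something along these lines (purely a sketch; the service name, namespace, and port are assumptions based on a default emissary-apiext install, and the image is just an example):

    # Hypothetical init container for the ambassador/emissary deployment:
    # block pod startup until the emissary-apiext webhook service answers.
    initContainers:
      - name: wait-for-apiext
        image: curlimages/curl:8.5.0
        command:
          - sh
          - -c
          - |
            until curl -sk --max-time 2 https://emissary-apiext.emissary-system.svc:443/ > /dev/null; do
              echo "waiting for emissary-apiext conversion webhook..."
              sleep 2
            done

This would only cover startup ordering; it would not help if emissary-apiext disappears after emissary has already started, but it would address the restart race described above.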

wasparch commented 2 years ago

@alexgervais any news?