envoyproxy / gateway

Manages Envoy Proxy as a Standalone or Kubernetes-based Application Gateway
https://gateway.envoyproxy.io
Apache License 2.0

Unexpected Pod while deleting gateways with merge gateways enabled #2637

Open shawnh2 opened 8 months ago

shawnh2 commented 8 months ago

Description:

Some unexpected Pods appear while deleting Gateways with the mergeGateways feature enabled.

Repro steps:

  1. Apply the manifest from quickstart
  2. Apply the config below to create a GC named mg with the MergeGateways feature on, and 3 GTWs, each with one HTTPRoute attached (3 HTTPRoutes in total)
kubectl apply -f mg.yaml

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: mg
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: custom-proxy-config
    namespace: envoy-gateway-system
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: custom-proxy-config
  namespace: envoy-gateway-system
spec:
  mergeGateways: true
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: merged-eg-1
  namespace: default
spec:
  gatewayClassName: mg
  listeners:
    - allowedRoutes:
        namespaces:
          from: Same
      name: http
      port: 8080
      protocol: HTTP
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: merged-eg-2
  namespace: default
spec:
  gatewayClassName: mg
  listeners:
    - allowedRoutes:
        namespaces:
          from: Same
      name: http
      port: 8081
      protocol: HTTP
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: merged-eg-3
  namespace: default
spec:
  gatewayClassName: mg
  listeners:
    - allowedRoutes:
        namespaces:
          from: Same
      name: http
      port: 8082
      protocol: HTTP
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: hostname1-route
spec:
  parentRefs:
    - name: merged-eg-1
  hostnames:
    - "www.example.com"
  rules:
    - backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
          weight: 1
      matches:
        - path:
            type: PathPrefix
            value: /example
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: hostname2-route
spec:
  parentRefs:
    - name: merged-eg-2
  hostnames:
    - "www.example2.com"
  rules:
    - backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
          weight: 1
      matches:
        - path:
            type: PathPrefix
            value: /example2
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: hostname3-route
spec:
  parentRefs:
    - name: merged-eg-3
  hostnames:
    - "www.example3.com"
  rules:
    - backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
          weight: 1
      matches:
        - path:
            type: PathPrefix
            value: /example3
```
  3. Everything is fine at this point
k get gc
NAME            CONTROLLER                                      ACCEPTED   AGE
envoy-gateway   gateway.envoyproxy.io/gatewayclass-controller   True       65m
mg              gateway.envoyproxy.io/gatewayclass-controller   True       2s

k get gtw
NAME          CLASS   ADDRESS          PROGRAMMED   AGE
merged-eg-1   mg      172.18.255.200   True         4s
merged-eg-2   mg      172.18.255.200   True         4s
merged-eg-3   mg      172.18.255.200   True         4s

k get po -n envoy-gateway-system
NAME                                 READY   STATUS    RESTARTS   AGE
envoy-gateway-6b8cdbfcdc-fbcb8       1/1     Running   0          66m  # <- focus on this
envoy-mg-e9949d90-6d8596c878-ff5rb   1/1     Running   0          14s
  4. k delete -f mg.yaml

  5. Some unexpected Pods show up

k get po -n envoy-gateway-system
NAME                                                  READY   STATUS        RESTARTS   AGE
envoy-default-merged-eg-2-c7655c02-dd7c5d5d7-wp62r    0/1     Terminating   0          2s  # <- unexpected
envoy-default-merged-eg-3-2a08f1cd-85dc9dbc6f-6wm7n   0/1     Terminating   0          2s  # <- unexpected
envoy-gateway-6b8cdbfcdc-fbcb8                        1/1     Running       0          66m

Environment:

latest

shawnh2 commented 8 months ago

This problem does not occur if the Gateways and HTTPRoutes are deleted first, followed by the GatewayClass and EP (EnvoyProxy).

If the GatewayClass and EP are deleted first, the merge gateways feature no longer applies, so all 3 gateways are created separately.

So when running k delete -f mg.yaml, the GatewayClass and EP are deleted first, which causes all 3 gateways to be created separately and then deleted immediately.
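
As a rough illustration of that ordering, here is a minimal Go sketch that sorts the resources from mg.yaml so dependents are deleted before the objects they reference. The `resource` type, the `deletePriority` table, and the `deletionOrder` helper are all hypothetical names made up for this example; they are not part of kubectl or Envoy Gateway.

```go
package main

import (
	"fmt"
	"sort"
)

// resource is a hypothetical, minimal stand-in for a Kubernetes object.
type resource struct {
	Kind string
	Name string
}

// deletePriority ranks kinds so that dependents are deleted before the
// objects they reference. Kinds not listed here get 0 and sort first.
var deletePriority = map[string]int{
	"HTTPRoute":    0,
	"Gateway":      1,
	"GatewayClass": 2,
	"EnvoyProxy":   3,
}

// deletionOrder returns a copy of in, stably sorted by deletePriority.
func deletionOrder(in []resource) []resource {
	out := append([]resource(nil), in...)
	sort.SliceStable(out, func(i, j int) bool {
		return deletePriority[out[i].Kind] < deletePriority[out[j].Kind]
	})
	return out
}

func main() {
	// Same order as the documents appear in mg.yaml (one of each kind shown).
	manifest := []resource{
		{"GatewayClass", "mg"},
		{"EnvoyProxy", "custom-proxy-config"},
		{"Gateway", "merged-eg-1"},
		{"HTTPRoute", "hostname1-route"},
	}
	for _, r := range deletionOrder(manifest) {
		fmt.Printf("%s/%s\n", r.Kind, r.Name)
	}
}
```

With this ordering the HTTPRoutes and Gateways go away while the GatewayClass and EP still exist, which is the sequence that avoids the extra Pods.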

shawnh2 commented 8 months ago

Found the cause of this problem, and it does not seem like a bug to me, so I'm closing this one for now.

arkodg commented 8 months ago

hey @shawnh2 if the GWC gets deleted first, why will the GWs get created again ?

arkodg commented 8 months ago

maybe related to https://github.com/envoyproxy/gateway/pull/2659

shawnh2 commented 8 months ago

> hey @shawnh2 if the GWC gets deleted first, why will the GWs get created again ?

To answer this question, I took a closer look at the EG reconcile method, and here is what I found.

  1. If the GWC is deleted first, it still appears in the acceptedGCs list, because it does not pass the finalizer check: https://github.com/envoyproxy/gateway/blob/cf46fbe776918ad19444e26d637ffcc79676ca23/internal/provider/kubernetes/controller.go#L153-L154

  2. EG then uses the deleted GWC's name as the index to list the GTWs associated with it. We still get all the GTWs here, since they have not been deleted yet: https://github.com/envoyproxy/gateway/blob/cf46fbe776918ad19444e26d637ffcc79676ca23/internal/provider/kubernetes/controller.go#L625-L626 So all the GTWs are recreated, as I described above.

  3. We never reach the logic that removes the finalizer from the GWC, so the GWC never passes the finalizer check in step 1 and remains accepted: https://github.com/envoyproxy/gateway/blob/cf46fbe776918ad19444e26d637ffcc79676ca23/internal/provider/kubernetes/controller.go#L347
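
The three steps above can be modeled with a small, self-contained Go sketch. This is only a simplified model of the behavior described in this thread, not the actual controller code: the `gatewayClass` and `gateway` types, the `accepted`, `gatewaysFor`, and `proxies` functions, and the proxy keys they return are all hypothetical. It also assumes, per the earlier comment, that merging stops applying once the EP is gone.

```go
package main

import "fmt"

// All types below are hypothetical stand-ins, not Envoy Gateway's real types.
type gatewayClass struct {
	Name          string
	Deleting      bool // deletionTimestamp is set
	HasFinalizer  bool // the controller's finalizer is still present
	MergeGateways bool // assumed false once the referenced EnvoyProxy is gone
}

type gateway struct {
	Namespace, Name, ClassName string
}

// accepted models step 1: the class is dropped only when it is being deleted
// AND its finalizer has already been removed.
func accepted(gc gatewayClass) bool {
	return !(gc.Deleting && !gc.HasFinalizer)
}

// gatewaysFor models step 2: Gateways are looked up by class name, so they
// are still found while the class itself is terminating.
func gatewaysFor(gc gatewayClass, all []gateway) []gateway {
	var out []gateway
	for _, g := range all {
		if g.ClassName == gc.Name {
			out = append(out, g)
		}
	}
	return out
}

// proxies returns one key per class when merging, else one key per Gateway.
func proxies(gc gatewayClass, all []gateway) []string {
	if !accepted(gc) {
		return nil
	}
	gws := gatewaysFor(gc, all)
	if len(gws) == 0 {
		return nil
	}
	if gc.MergeGateways {
		return []string{gc.Name}
	}
	keys := make([]string, 0, len(gws))
	for _, g := range gws {
		keys = append(keys, g.Namespace+"/"+g.Name)
	}
	return keys
}

func main() {
	gws := []gateway{
		{"default", "merged-eg-1", "mg"},
		{"default", "merged-eg-2", "mg"},
		{"default", "merged-eg-3", "mg"},
	}
	before := gatewayClass{Name: "mg", HasFinalizer: true, MergeGateways: true}
	// State after the class and EP are deleted first; the finalizer stays
	// because Gateways still reference the class (step 3).
	after := gatewayClass{Name: "mg", Deleting: true, HasFinalizer: true}

	fmt.Println(proxies(before, gws)) // [mg]
	fmt.Println(proxies(after, gws))  // [default/merged-eg-1 default/merged-eg-2 default/merged-eg-3]
}
```

In this model the terminating class is still accepted, so the three Gateways are each given their own proxy, which matches the short-lived envoy-default-merged-eg-* Pods seen in the report.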

shawnh2 commented 8 months ago

I'm not sure what the expected behavior should be here.

IMO, at least all the GTWs should be in a not-accepted status, and none of the related resources such as Services, Deployments, etc. should be recreated.
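
One possible shape for that expectation, sketched in Go under stated assumptions: a terminating GatewayClass is treated as unusable as soon as deletion starts, so its Gateways are reported as not accepted and no infrastructure is provisioned. The `gatewayClass` and `gatewayStatus` types and the `statusFor` function are hypothetical and only illustrate the idea; this is not the fix that was or will be implemented.

```go
package main

import "fmt"

// Hypothetical stand-ins; not Envoy Gateway's real types or its eventual fix.
type gatewayClass struct {
	Name     string
	Deleting bool // deletionTimestamp is set
}

type gatewayStatus struct {
	Accepted       bool // whether the Gateway would be reported as accepted
	ProvisionInfra bool // whether Service/Deployment would be (re)created
}

// statusFor treats a missing or terminating class as unusable as soon as
// deletion starts, regardless of whether a finalizer is still present.
func statusFor(gc *gatewayClass) gatewayStatus {
	if gc == nil || gc.Deleting {
		return gatewayStatus{Accepted: false, ProvisionInfra: false}
	}
	return gatewayStatus{Accepted: true, ProvisionInfra: true}
}

func main() {
	fmt.Printf("%+v\n", statusFor(&gatewayClass{Name: "mg"}))
	fmt.Printf("%+v\n", statusFor(&gatewayClass{Name: "mg", Deleting: true}))
}
```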

cnvergence commented 8 months ago

It is also happening when applying resources; I have seen this in the past. Is this only happening with merged gateways, @shawnh2?

This behaviour seems similar to what was fixed before in https://github.com/envoyproxy/gateway/pull/2395

Since we changed the watchable interface to a map, which is unordered, I wonder if it is causing these unnecessary updates.
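
To illustrate the concern about unordered maps: Go does not guarantee map iteration order, so a slice built by ranging over a map can differ between two passes even when the contents are identical, and a naive comparison would then report a change. The sketch below shows the usual remedy of sorting the keys first. The `snapshot` helper and the sample data are hypothetical and are not taken from the watchable code.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// snapshot flattens a map into a slice in a stable order by sorting the keys
// first, so two snapshots of equal maps always compare equal.
func snapshot(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	out := make([]string, 0, len(keys))
	for _, k := range keys {
		out = append(out, k+"="+m[k])
	}
	return out
}

func main() {
	// Hypothetical data: Gateway name mapped to listener port.
	resources := map[string]string{
		"merged-eg-3": "8082",
		"merged-eg-1": "8080",
		"merged-eg-2": "8081",
	}
	prev := snapshot(resources)
	next := snapshot(resources)
	fmt.Println(reflect.DeepEqual(prev, next)) // true
}
```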

shawnh2 commented 8 months ago

> It is also happening when applying resources, I have seen this in the past, is this only happening with merged gateways @shawnh2?
>
> This behaviour seems similar to what was fixed before https://github.com/envoyproxy/gateway/pull/2395
>
> Since we changed the watchable interface to a map, which is unordered, wonder if it is causing these unnecessary updates.

Yes, it only happens with merge gateways. It's mainly caused by step 2, which I described above.

cnvergence commented 8 months ago

I see, this matches the deletion issue.

github-actions[bot] commented 6 months ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days.

arkodg commented 5 months ago

@shawnh2 can this one be closed ?

shawnh2 commented 5 months ago

I think this behavior is a bug, will send a fix ASAP.

github-actions[bot] commented 4 months ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days.

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days.