Azure / application-gateway-kubernetes-ingress

This is an ingress controller that can be run on Azure Kubernetes Service (AKS) to allow an Azure Application Gateway to act as the ingress for an AKS cluster.
https://azure.github.io/application-gateway-kubernetes-ingress
MIT License

AGIC does not update remaining configs when a failure happens #1541

Open ctmillerlin opened 1 year ago

ctmillerlin commented 1 year ago

Describe the bug
When AGIC hits a protocol mismatch error, it stops updating the remaining ingress configs, even those that are correct.

Expected behavior: even when this error occurs, AGIC should still update the remaining rules whose configs are correct.

Current behavior: when this error occurs, AGIC does not update the remaining rules.

Business impact: applications cannot be reached (Application Gateway returns a 502 error) while the issue persists.
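
To make the expected behavior concrete, here is a minimal Python sketch of "skip the bad config, apply the rest". It is only an illustration under stated assumptions: `BackendConfig`, `partition_configs`, and the ingress names are hypothetical, and this is not how AGIC is implemented (AGIC is written in Go, and the logs in this report show a single `CreateOrUpdate` call failing).

```python
from dataclasses import dataclass


@dataclass
class BackendConfig:
    """Hypothetical per-ingress backend config; not an AGIC type."""
    ingress: str
    probe_protocol: str      # protocol used by the health probe
    settings_protocol: str   # protocol used by the backend HTTP settings


def partition_configs(configs):
    """Split configs into (valid, rejected) using a protocol-match check."""
    valid, rejected = [], []
    for cfg in configs:
        if cfg.probe_protocol.lower() == cfg.settings_protocol.lower():
            valid.append(cfg)
        else:
            rejected.append(cfg)
    return valid, rejected


configs = [
    BackendConfig("ingress-a", "Http", "Http"),
    BackendConfig("ingress-b", "Http", "Http"),
    BackendConfig("ingress-c", "Http", "Https"),  # mismatch, as in the reported error
]

valid, rejected = partition_configs(configs)
applied = [cfg.ingress for cfg in valid]      # configs that would still be applied
skipped = [cfg.ingress for cfg in rejected]   # configs that would be reported and skipped
print("applied:", applied)
print("skipped:", skipped)
```

With this kind of per-ingress validation, a broken ingress C would be reported and skipped while updates for ingress A and ingress B still go through.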

To Reproduce Steps to reproduce the behavior:

  1. Start with two ingresses, ingress A and ingress B, both working correctly.
  2. Create a new ingress, ingress C, with backend port 443, but do not create the related backend resources.
  3. Delete pod A, the backend pod of ingress A. Assume the old pod A had IP 10.2.0.60 and the new pod A has IP 10.2.0.37.
  4. AGIC is expected to update the backend pool in the AG with the new IP 10.2.0.37, but the backend shows as unhealthy with a timeout error message.

See the screenshots below for more details.

[Seven screenshots; images not preserved]

Ingress Controller details

* Output of `kubectl describe pod <ingress controller>`:

Namespace:        kube-system
Priority:         0
Service Account:  appgw-public-sa-ingress-azure
Node:             aks-syspool-xxx-xxx/10.2.4.113
Start Time:       Mon, 17 Apr 2023 11:54:46 +0800
Labels:           aadpodidbinding=appgw-public-ingress-azure
                  app=ingress-azure
                  pod-template-hash=f9b9bf464
                  release=appgw-public
Annotations:      checksum/config: 006decf114c41e0325d5c981bdd9b94456885d3943b5a6253b6c12d48787fd79
                  prometheus.io/port: 8123
                  prometheus.io/scrape: true
Status:           Running
IP:               10.2.4.128
IPs:
  IP:           10.2.4.128
Controlled By:  ReplicaSet/appgw-public-ingress-azure-f9b9bf464
Containers:
  ingress-azure:
    Container ID:   containerd://316fc4d66a5744f6a6836e702f32a63fac4c034f93400c8f895f132870e30f61
    Image:          mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.5.2
    Image ID:       mcr.microsoft.com/azure-application-gateway/kubernetes-ingress@sha256:31a876143de3aca583f0508c0eb0d2a69e1d3da21dba003ca0bdfc2434f807bd
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 17 Apr 2023 11:54:50 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     200m
      memory:  800Mi
    Requests:
      cpu:      100m
      memory:   100Mi
    Liveness:   http-get http://:8123/health/alive delay=15s timeout=1s period=20s #success=1 #failure=3
    Readiness:  http-get http://:8123/health/ready delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      appgw-public-cm-ingress-azure  ConfigMap  Optional: false
    Environment:
      AGIC_POD_NAMESPACE:             kube-system (v1:metadata.namespace)
      KUBERNETES_PORT_443_TCP_ADDR:   xxx.hcp.eastus.azmk8s.io
      KUBERNETES_PORT:                tcp://xxx.hcp.eastus.azmk8s.io:443
      KUBERNETES_PORT_443_TCP:        tcp://xxx.hcp.eastus.azmk8s.io:443
      KUBERNETES_SERVICE_HOST:        xxx.hcp.eastus.azmk8s.io
      AZURE_CLOUD_PROVIDER_LOCATION:  /etc/appgw/azure.json
      AGIC_POD_NAME:                  appgw-public-ingress-azure-f9b9bf464-p6lrs (v1:metadata.name)
    Mounts:
      /etc/appgw/ from azure (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-spztp (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  azure:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/
    HostPathType:  Directory
  kube-api-access-spztp:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              rfpool=syspool
Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                     Age                 From                       Message
  ----     ------                     ----                ----                       -------
  Warning  FailedApplyingAppGwConfig  35m (x12 over 40m)  azure/application-gateway  network.ApplicationGatewaysClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="ApplicationGatewayProbeProtocolMustMatchBackendHttpSettinsProtocol" Message="Probe /subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxxPublicGateway/probes/defaultprobe-Http protocol (Http) does not match Backend Http Setting /subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxxPublicGateway/backendHttpSettingsCollection/bp-dev-consumer-marketing-site-443-443-marketing-site protocol (Https)." Details=[]
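
The `FailedApplyingAppGwConfig` event above names both conflicting resources. As a triage aid, the following Python sketch pulls the error code, probe name, backend HTTP setting name, and their protocols out of that message. The regular expression is an assumption tailored to this one message, not a documented Azure error format.

```python
import re

# Error text copied from the FailedApplyingAppGwConfig event above.
EVENT = (
    'Code="ApplicationGatewayProbeProtocolMustMatchBackendHttpSettinsProtocol" '
    'Message="Probe /subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network'
    '/applicationGateways/xxxPublicGateway/probes/defaultprobe-Http protocol (Http) '
    'does not match Backend Http Setting /subscriptions/xxx/resourceGroups/xxx/providers'
    '/Microsoft.Network/applicationGateways/xxxPublicGateway/backendHttpSettingsCollection'
    '/bp-dev-consumer-marketing-site-443-443-marketing-site protocol (Https)." Details=[]'
)

# Hypothetical pattern: captures the last path segment of each resource ID
# and the protocol shown in parentheses after it.
PATTERN = re.compile(
    r'Code="(?P<code>[^"]+)" '
    r'Message="Probe \S*/probes/(?P<probe>\S+) protocol \((?P<probe_protocol>\w+)\) '
    r'does not match Backend Http Setting '
    r'\S*/backendHttpSettingsCollection/(?P<settings>\S+) '
    r'protocol \((?P<settings_protocol>\w+)\)'
)

match = PATTERN.search(EVENT)
info = match.groupdict() if match else {}
for key, value in info.items():
    print(f"{key}: {value}")
```

Here the extracted setting name, `bp-dev-consumer-marketing-site-443-443-marketing-site`, refers to the same `dev-consumer/marketing-site` service for which the controller logs report `Endpoint not found`.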

* Output of `kubectl logs <ingress controller>`:

E0516 02:56:18.765339       1 context.go:366] Code="ErrorFetchingEndpoints" Message="Endpoint not found for dev-consumer/marketing-site"
E0516 02:56:18.765343       1 backendaddresspools.go:102] Code="ErrorFetchingEndpoints" Message="Endpoint not found for dev-consumer/marketing-site"
I0516 02:56:19.353076       1 mutate_app_gateway.go:177] BEGIN AppGateway deployment
I0516 02:56:19.850701       1 mutate_app_gateway.go:183] END AppGateway deployment
E0516 02:56:19.850817       1 controller.go:141] network.ApplicationGatewaysClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="ApplicationGatewayProbeProtocolMustMatchBackendHttpSettinsProtocol" Message="Probe /subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxxPublicGateway/probes/defaultprobe-Http protocol (Http) does not match Backend Http Setting /subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxxPublicGateway/backendHttpSettingsCollection/bp-dev-consumer-marketing-site-443-443-marketing-site protocol (Https)." Details=[]
E0516 02:56:19.850832       1 worker.go:62] Error processing event.network.ApplicationGatewaysClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="ApplicationGatewayProbeProtocolMustMatchBackendHttpSettinsProtocol" Message="Probe /subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxxPublicGateway/probes/defaultprobe-Http protocol (Http) does not match Backend Http Setting /subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxxPublicGateway/backendHttpSettingsCollection/bp-dev-consumer-marketing-site-443-443-marketing-site protocol (Https)." Details=[]