Azure / application-gateway-kubernetes-ingress

This is an ingress controller that can be run on Azure Kubernetes Service (AKS) to allow an Azure Application Gateway to act as the ingress for an AKS cluster.
https://azure.github.io/application-gateway-kubernetes-ingress
MIT License

SSL Certificates are not pruned by AGIC #1488

Open jkroepke opened 1 year ago

jkroepke commented 1 year ago

Describe the bug

We have hit the Application Gateway limit of 100 SSL certificates even though we have fewer than 100 Ingress objects deployed.

To Reproduce

Steps to reproduce the behavior:

Create a large number of Ingress objects with the annotation appgw.ingress.kubernetes.io/ssl-redirect: "true", including the associated services and backend pods, one after another, and then remove them. Create them again.

After a while, a large number of certificates show up in the output of

az network application-gateway ssl-cert list --gateway-name d-customer-app-agw-01 --resource-group d-customer-app-rg

I'm also able to delete the stale certificates, which confirms that they are not used by any listener.
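As a rough illustration of what "stale" means here, the following sketch lists certificates that no listener references. It is not part of AGIC. It assumes the gateway JSON has the shape returned by `az network application-gateway show`, with top-level `sslCertificates` and `httpListeners` arrays where each listener may carry an `sslCertificate.id`. The function name and the sample data are hypothetical.

```python
from typing import Any


def find_unreferenced_ssl_certs(gateway: dict[str, Any]) -> list[str]:
    """Return the names of SSL certificates that no HTTP listener references.

    Assumes `gateway` is the parsed JSON of an Application Gateway, with
    `sslCertificates` and `httpListeners` as top-level lists (an assumption
    about the CLI output shape, not something AGIC guarantees).
    """
    referenced: set[str] = set()
    for listener in gateway.get("httpListeners") or []:
        cert_ref = listener.get("sslCertificate") or {}
        cert_id = cert_ref.get("id")
        if cert_id:
            # Azure resource IDs are case-insensitive, so compare lowercased.
            referenced.add(cert_id.lower())

    return sorted(
        cert["name"]
        for cert in gateway.get("sslCertificates") or []
        if cert["id"].lower() not in referenced
    )


# Hypothetical sample: two certificates, only one bound to a listener.
sample_gateway = {
    "sslCertificates": [
        {"id": "/gw/sslCertificates/cert-a", "name": "cert-a"},
        {"id": "/gw/sslCertificates/cert-b", "name": "cert-b"},
    ],
    "httpListeners": [
        {"name": "fl-https", "sslCertificate": {"id": "/gw/sslCertificates/CERT-A"}},
        {"name": "fl-http", "sslCertificate": None},
    ],
}

stale = find_unreferenced_ssl_certs(sample_gateway)
print(stale)
```

To run this against a real gateway, parse the `az network application-gateway show` output with `json.load` and pass it to the function. Review the result before deleting anything, since a listener may be about to pick up one of the listed certificates on AGIC's next reconcile.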

Ingress Controller details

Name:             agic-ingress-azure-7f6bf479b5-qptnd
Namespace:        agic
Priority:         0
Service Account:  agic-sa-ingress-azure
Node:             aks-user01-82964380-vmss00000d/172.24.0.4
Start Time:       Wed, 11 Jan 2023 10:04:40 +0100
Labels:           app=ingress-azure
                  pod-template-hash=7f6bf479b5
                  release=agic
Annotations:      checksum/config: 83f22312d0e6a21fbde286f3da7429f4ed7b79d725958909572ee9a6e4a7af00
                  cni.projectcalico.org/containerID: 778389587b6a30b18ac4a82e112ee58e2e0d2bb6609cafcfd88ca0d9d3af1188
                  cni.projectcalico.org/podIP: 100.127.0.67/32
                  cni.projectcalico.org/podIPs: 100.127.0.67/32
                  prometheus.io/port: 8123
                  prometheus.io/scrape: true
Status:           Running
IP:               100.127.0.67
IPs:
  IP:           100.127.0.67
Controlled By:  ReplicaSet/agic-ingress-azure-7f6bf479b5
Containers:
  ingress-azure:
    Container ID:   containerd://8e140638681bbbd42aaca5bd178001d8d0588b2e1f8c580e7ae5b745dba120bf
    Image:          mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.5.2
    Image ID:       mcr.microsoft.com/azure-application-gateway/kubernetes-ingress@sha256:31a876143de3aca583f0508c0eb0d2a69e1d3da21dba003ca0bdfc2434f807bd
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Wed, 11 Jan 2023 10:04:44 +0100
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:8123/health/alive delay=15s timeout=1s period=20s #success=1 #failure=3
    Readiness:      http-get http://:8123/health/ready delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      agic-cm-ingress-azure  ConfigMap  Optional: false
    Environment:
      AZURE_CLOUD_PROVIDER_LOCATION:  /etc/appgw/azure.json
      AGIC_POD_NAME:                  agic-ingress-azure-7f6bf479b5-qptnd (v1:metadata.name)
      AGIC_POD_NAMESPACE:             agic (v1:metadata.namespace)
      AZURE_AUTH_LOCATION:            /etc/Azure/Networking-AppGW/auth/armAuth.json
    Mounts:
      /etc/Azure/Networking-AppGW/auth from networking-appgw-k8s-azure-service-principal-mount (ro)
      /etc/appgw/ from azure (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-6gk9g (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  azure:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/
    HostPathType:  Directory
  networking-appgw-k8s-azure-service-principal-mount:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  networking-appgw-k8s-azure-service-principal
    Optional:    false
  kube-api-access-6gk9g:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                     Age                   From                       Message
  ----     ------                     ----                  ----                       -------
  Warning  FailedApplyingAppGwConfig  10s (x11 over 4m54s)  azure/application-gateway  network.ApplicationGatewaysClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="ApplicationGatewayRequestRoutingRulePriorityMustBeUnique" Message="Priority must be unique across all the request routing rules. Rules /subscriptions/3c668f12-0000-4844-861c-bf5364669149/resourceGroups/d-customer-app-rg/providers/Microsoft.Network/applicationGateways/d-customer-app-agw-01/requestRoutingRules/rr-f004bf78756cc24eb40a86bd77bd61d0 and /subscriptions/3c668f12-0000-4844-861c-bf5364669149/resourceGroups/d-customer-app-rg/providers/Microsoft.Network/applicationGateways/d-customer-app-agw-01/requestRoutingRules/rr-f09031d6b7be056016562d636185a3c2 cannot have the same priority 20000." Details=[]

E0111 10:51:48.645878 1 worker.go:62] Error processing event.network.ApplicationGatewaysClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code=\"ApplicationGatewaySslCertLimitReached\" Message=\"The number of SSL certificates exceeds the maximum allowed value. The number of SSL certificates is 112 and the maximum allowed is 100.\" Details=[]

jkroepke commented 1 year ago

ref: https://github.com/Azure/application-gateway-kubernetes-ingress/issues/1228

slushysnowman commented 1 year ago

We're also experiencing this. The expectation would be that AGIC cleans up certificates; why is that not the case?

EvolutionOli commented 1 year ago

Shame this is still not fixed. We are using ingress with SSL and have hit this limit several times now.

jkroepke commented 1 year ago

After the announcement of Application Gateway for Containers, I expect AGIC to be deprecated anyway.

ptemmer commented 2 weeks ago

Same issue here. Any update?