Azure / application-gateway-kubernetes-ingress

This is an ingress controller that can be run on Azure Kubernetes Service (AKS) to allow an Azure Application Gateway to act as the ingress for an AKS cluster.
https://azure.github.io/application-gateway-kubernetes-ingress
MIT License

Ingress pod is stuck at step Getting Application Gateway config #1534

Closed PaulIsProgramming closed 1 year ago

PaulIsProgramming commented 1 year ago

**Describe the bug**
I followed the Brownfield install guide, and my ingress pod is not starting. With AGIC versions 1.5 and 1.6, it gets stuck at "Getting Application Gateway configuration" (output of `kubectl logs my-ingress-pod`):

I0420 05:31:22.164850       1 utils.go:114] Using verbosity level 5 from environment variable APPGW_VERBOSITY_LEVEL
I0420 05:31:22.200218       1 environment.go:294] KUBERNETES_WATCHNAMESPACE is not set. Watching all available namespaces.
I0420 05:31:22.200240       1 main.go:118] Using User Agent Suffix='ingress-azure-5fc5b878b-tkxz7' when communicating with ARM
I0420 05:31:22.200322       1 main.go:137] Application Gateway Details: Subscription="ed9530cc-8deb-47cf-8bf5-c3f506f91e69" Resource Group="RG-FFX-MLP-SERVICES" Name="app_gw"
I0420 05:31:22.200333       1 auth.go:53] Creating authorizer from Azure Managed Service Identity
I0420 05:31:22.200480       1 httpserver.go:57] Starting API Server on :8123
I0420 05:31:22.201540       1 client.go:118] Getting Application Gateway configuration.

With version 1.7, it shows the same error as described in https://github.com/Azure/application-gateway-kubernetes-ingress/issues/1533:

I0421 21:44:43.088378       1 utils.go:114] Using verbosity level 5 from environment variable APPGW_VERBOSITY_LEVEL
I0421 21:44:43.116018       1 supported_apiversion.go:70] server version is: 1.25.6
I0421 21:44:43.128208       1 environment.go:294] KUBERNETES_WATCHNAMESPACE is not set. Watching all available namespaces.
I0421 21:44:43.128233       1 main.go:118] Using User Agent Suffix='ingress-azure-85d48ccdc7-dfgxc' when communicating with ARM
I0421 21:44:43.128914       1 main.go:137] Application Gateway Details: Subscription="ed9530cc-8deb-47cf-8bf5-c3f506f91e69" Resource Group="RG-FFX-MLP-SERVICES" Name="app_gw"
I0421 21:44:43.128935       1 auth.go:58] Creating authorizer using Default Azure Credentials
I0421 21:44:43.129020       1 client.go:133] Getting Application Gateway configuration.
I0421 21:44:43.129545       1 httpserver.go:57] Starting API Server on :8123
E0421 21:44:44.131158       1 authorizer.go:46] Error getting Azure token: DefaultAzureCredential: failed to acquire a token.
Attempted credentials:
        EnvironmentCredential: missing environment variable AZURE_TENANT_ID
        WorkloadIdentityCredential: missing environment variables for workload identity. Check webhook and pod configuration
        ManagedIdentityCredential: IMDS token request timed out
        AzureCLICredential: Azure CLI not found on path
E0421 21:44:44.184937       1 client.go:184] configuration error (bad request) or unauthorized error while performing a GET using the authorizer
E0421 21:44:44.184963       1 client.go:185] stopping GET retries
F0421 21:44:44.188284       1 main.go:175] Failed getting Application Gateway: Code="ErrorApplicationGatewayUnexpectedStatusCode" Message="Unexpected status code '401' while performing a GET on Application Gateway." InnerError="network.ApplicationGatewaysClient#Get: Failure responding to request: StatusCode=401 -- Original Error: autorest/azure: Service returned an error. Status=401 Code="AuthenticationFailedMissingToken" Message="Authentication failed. The 'Authorization' header is missing the access token.""

**To Reproduce**
Steps to reproduce the behavior: install AGIC with the following Helm configuration.

# Verbosity level of the App Gateway Ingress Controller
verbosityLevel: 5

################################################################################
# Specify which application gateway the ingress controller will manage
#
appgw:
    subscriptionId: aaaaaa-bbbb-cccc-8bf5-c3f506f91e69
    resourceGroup: RG-MY-APPGATEWAY
    name: app_gw
    usePrivateIP: false

    # Setting appgw.shared to "true" will create an AzureIngressProhibitedTarget CRD.
    # This prohibits AGIC from applying config for any host/path.
    # Use "kubectl get AzureIngressProhibitedTargets" to view and change this.
    shared: false

################################################################################
# Specify which kubernetes namespace the ingress controller will watch
# Default value is "default"
# Leaving this variable out or setting it to blank or empty string would
# result in Ingress Controller observing all acessible namespaces.
#
# kubernetes:
#   watchNamespace:

################################################################################
# Specify the authentication with Azure Resource Manager
#
# Two authentication methods are available:
# - Option 1: AAD-Pod-Identity (https://github.com/Azure/aad-pod-identity)
armAuth:
    type: aadPodIdentity
    identityResourceID: /subscriptions/xxxxxxx-xxxx-xxxx-xxxx-c3f506f91e69/resourcegroups/RG-MyCluster/providers/Microsoft.ManagedIdentity/userAssignedIdentities/my_user_assinged_identity
    identityClientID: xxxxxx-xxxx-xxxx-xxxx-1faab8e9xxxx

################################################################################
# Specify if the cluster is RBAC enabled or not
rbac:
    enabled: true # true/false


***Potentially important***
The application gateway is also (successfully) used by another cluster that uses the Azure add-on for the AGIC installation.
That should not prevent the ingress pods in this cluster from starting successfully.

**Ingress Controller details**
* `kubectl logs ingress-pod` output

I0419 21:59:33.186543       1 utils.go:114] Using verbosity level 5 from environment variable APPGW_VERBOSITY_LEVEL
I0419 21:59:33.222255       1 environment.go:294] KUBERNETES_WATCHNAMESPACE is not set. Watching all available namespaces.
I0419 21:59:33.222277       1 main.go:118] Using User Agent Suffix='ingress-azure-5fc5b878b-tkxz7' when communicating with ARM
I0419 21:59:33.222400       1 main.go:137] Application Gateway Details: Subscription="aaaaaa-bbbb-cccc-8bf5-c3f506f91e69" Resource Group="RG-MY-APPGATEWAY" Name="app_gw"
I0419 21:59:33.222412       1 auth.go:53] Creating authorizer from Azure Managed Service Identity
I0419 21:59:33.222560       1 httpserver.go:57] Starting API Server on :8123
I0419 21:59:33.223603       1 client.go:118] Getting Application Gateway configuration.


* `kubectl describe pod ingress-pod` output

Name:             ingress-azure-5fc5b878b-tkxz7
Namespace:        default
Priority:         0
Service Account:  ingress-azure
Node:             aks-agentpool-31797117-vmss000002/10.0.0.4
Start Time:       Wed, 19 Apr 2023 22:26:54 +0200
Labels:           aadpodidbinding=ingress-azure
                  app=ingress-azure
                  pod-template-hash=5fc5b878b
                  release=ingress-azure
Annotations:      checksum/config: e492d52df1fe1855b3e88a0c5e4fa7535a58f2656dea21f3194f07351e503d9c
                  prometheus.io/port: 8123
                  prometheus.io/scrape: true
Status:           Running
IP:               10.0.0.108
IPs:
  IP:           10.0.0.108
Controlled By:  ReplicaSet/ingress-azure-5fc5b878b
Containers:
  ingress-azure:
    Container ID:   containerd://b3ef558724ea66e0d395f9a85a5fcce51f4ba7217d495636492955bcb41f2928
    Image:          mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.5.0
    Image ID:       mcr.microsoft.com/azure-application-gateway/kubernetes-ingress@sha256:d3f2df83ef3e93acbd127208fcca607dbf825fa90a75de7e218aa05960555c97
    Port:
    Host Port:
    State:          Running
      Started:      Wed, 19 Apr 2023 23:59:33 +0200
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Wed, 19 Apr 2023 23:49:32 +0200
      Finished:     Wed, 19 Apr 2023 23:58:04 +0200
    Ready:          False
    Restart Count:  10
    Liveness:       http-get http://:8123/health/alive delay=15s timeout=1s period=20s #success=1 #failure=3
    Readiness:      http-get http://:8123/health/ready delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      ingress-azure  ConfigMap  Optional: false
    Environment:
      AZURE_CLOUD_PROVIDER_LOCATION:  /etc/appgw/azure.json
      AGIC_POD_NAME:                  ingress-azure-5fc5b878b-tkxz7 (v1:metadata.name)
      AGIC_POD_NAMESPACE:             default (v1:metadata.namespace)
    Mounts:
      /etc/appgw/ from azure (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9jnss (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  azure:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/
    HostPathType:  Directory
  kube-api-access-9jnss:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From     Message
  Warning  BackOff    38m (x21 over 76m)     kubelet  Back-off restarting failed container
  Warning  Unhealthy  3m39s (x564 over 93m)  kubelet  Readiness probe failed: Get "http://10.0.0.108:8123/health/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)


I have set `verbosityLevel` to 5 but that did not change the log output.

*  `kubectl get event` output

51s   Warning   Unhealthy   pod/ingress-azure-7678cfcd4-9hmmw   Readiness probe failed: Get "http://10.0.0.33:8123/health/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
25m   Warning   BackOff     pod/ingress-azure-7678cfcd4-9hmmw   Back-off restarting failed container



It sounds similar to https://github.com/Azure/application-gateway-kubernetes-ingress/issues/1459 but in this report there is some additional log output.

I need to connect multiple clusters to the App Gateway, so I cannot use the Azure add-on.

**Questions**
* I am not sure whether it successfully got the Application Gateway configuration, since that is the last line in the log output.
* I would love to know where to look or which log to go through to find out where I am stuck, as I don't have any error log to follow up on.
* `kubectl get pods` shows that the pods are running but not ready, and I can't find any hint as to why that is:
```default             ingress-azure-7678cfcd4-9hmmw            0/1     Running   9 (9m14s ago)    91m```

Any suggestions are highly appreciated.
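
On the question of where to look: AGIC logs in the klog format, where the first character of each line is the severity (`I` info, `W` warning, `E` error, `F` fatal). A minimal sketch of a filter that keeps only error and fatal lines is shown below. The function name `klog_errors` is hypothetical and not part of AGIC or kubectl. Because the pod above has restarted several times, the log of the previous container instance (`kubectl logs --previous`) may hold lines that the current one has not reached yet.

```shell
# Hypothetical helper: keep only error (E) and fatal (F) lines from
# klog-formatted input read on stdin. klog lines begin with a severity
# letter followed by the month and day as four digits, e.g. "E0421 ".
klog_errors() {
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -E '^[EF][0-9]{4} ' || true
}

# Usage (not executed here; the pod name is taken from the report above):
#   kubectl logs ingress-azure-5fc5b878b-tkxz7 --previous | klog_errors
```

Continuation lines that do not start with a severity letter, such as the "Attempted credentials:" list in the 1.7 output, are dropped by this filter, so the full log is still worth reading once an error line has been found.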
PaulIsProgramming commented 1 year ago

I got it to work with a service principal and AGIC 1.7. I had to give the service principal Contributor access to the App Gateway, as described in the Brownfield tutorial under "AAD Pod Identity", bullet point 3 (although I was using a service principal).
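
The role assignment described above can be sketched with the Azure CLI. This is only an illustration under assumptions: the helper names `appgw_scope` and `print_role_assignment` are hypothetical, the subscription, resource group, and gateway name are the placeholder values from the config in this report, and `<service-principal-app-id>` stands for an application ID that is not given here. The script prints the `az role assignment create` command instead of running it, so it can be reviewed first.

```shell
# Hypothetical helper: build the ARM resource ID of an Application Gateway,
# which is used as the scope of the role assignment.
# Arguments: subscription ID, resource group, gateway name.
appgw_scope() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Network/applicationGateways/%s\n' \
    "$1" "$2" "$3"
}

# Hypothetical helper: print (do not run) the command that would grant the
# Contributor role on the given scope.
# Arguments: service principal application ID, scope.
print_role_assignment() {
  printf 'az role assignment create --role Contributor --assignee "%s" --scope "%s"\n' \
    "$1" "$2"
}

# Placeholder values copied from the report; replace them with real ones.
scope="$(appgw_scope 'aaaaaa-bbbb-cccc-8bf5-c3f506f91e69' 'RG-MY-APPGATEWAY' 'app_gw')"
print_role_assignment '<service-principal-app-id>' "$scope"
```

Scoping the assignment to the gateway's resource ID rather than to the whole resource group keeps the grant as narrow as the fix described here requires.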