Closed AndrewJR350 closed 3 weeks ago
Hello @AndrewJR350,
Do you have a link to the document you are following? The helm installation command should have no dependency on ARM.
What errors are you seeing on pod start?
Hi @JackStromberg
I was following this documentation. This created the pod, and I was able to list the ALB controller using the command kubectl get pods -n azure-alb-system.
However, the pod never started. It pulled the image successfully, but when I checked the pod's logs and state, it was in a crash loop with this message: Error: exec user process caused "exec format error"
I checked the manifest of the image installed via Helm, and it doesn't mention anything platform-specific. I also could not find any source code for this image.
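An "exec format error" usually means the kernel was asked to run a binary built for a different CPU architecture than the node's. A quick first check is to see which architecture each node reports. The sketch below is one way to do that, assuming the output of `kubectl get nodes -o json` is available as a string; it reads the standard `status.nodeInfo.architecture` field. The sample input is trimmed and illustrative: the node name comes from this thread, and the `arm64` value is an assumption based on the cluster being described as ARM.

```python
import json


def node_architectures(nodes_json: str) -> dict:
    """Map node name -> CPU architecture from `kubectl get nodes -o json` output."""
    doc = json.loads(nodes_json)
    result = {}
    for node in doc.get("items", []):
        name = node["metadata"]["name"]
        # nodeInfo.architecture is reported by the kubelet (e.g. "amd64", "arm64").
        node_info = node.get("status", {}).get("nodeInfo", {})
        result[name] = node_info.get("architecture", "unknown")
    return result


# Trimmed, illustrative sample; the architecture value is assumed, not taken from the thread.
sample = json.dumps({
    "items": [
        {
            "metadata": {"name": "aks-userpool-22578432-vmss000000"},
            "status": {"nodeInfo": {"architecture": "arm64"}},
        }
    ]
})

print(node_architectures(sample))
```

If every node reports `arm64` and the image only ships `amd64` binaries, the crash loop described above is the expected result.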
What is your AKS cluster version?
What is the output when describing the bootstrap pod?
kubectl describe pod alb-controller-bootstrap-<unique-id> -n azure-alb-system
Hi @JackStromberg, this is the error we see on the pod:
init-alb-controller-crds exec /usr/bin/sh: exec format error
This is the result of the command you requested:
kubectl describe pod alb-controller-bootstrap-778f96cfb4-mhdlw -n azure-alb-system
Name: alb-controller-bootstrap-778f96cfb4-mhdlw
Namespace: azure-alb-system
Priority: 2000001000
Priority Class Name: system-node-critical
Service Account: alb-controller-sa
Node: aks-userpool-22578432-vmss000000/10.0.0.113
Start Time: Wed, 18 Sep 2024 23:22:57 +0200
Labels: app=alb-controller-bootstrap
pod-template-hash=778f96cfb4
Annotations: kubernetes.azure.com/set-kube-service-host-fqdn: true
prometheus.io/port: 9002
prometheus.io/scrape: true
Status: Pending
IP: 10.0.0.132
IPs:
IP: 10.0.0.132
Controlled By: ReplicaSet/alb-controller-bootstrap-778f96cfb4
Init Containers:
init-alb-controller-crds:
Container ID: containerd://464c73fa9d9eb27dbb3c4f0ee5dce9682dbe45bda9fc5a75c7378bc1ee6e23d5
Image: mcr.microsoft.com/application-lb/images/alb-controller-crds:1.2.3
Image ID: mcr.microsoft.com/application-lb/images/alb-controller-crds@sha256:71dc7b7cc810a8eefb5d2fc12253a2aac42785483277101b93d90e83761aa218
Port: <none>
Host Port: <none>
Command:
sh
-c
kubectl apply -f /alb-controller-crds/agc-crds; kubectl apply -f /alb-controller-crds/gateway-api-crds;
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Wed, 18 Sep 2024 23:28:49 +0200
Finished: Wed, 18 Sep 2024 23:28:49 +0200
Ready: False
Restart Count: 6
Environment:
KUBERNETES_SERVICE_HOST: appkube-dns-7bp3b9jc.hcp.eastus.azmk8s.io
KUBERNETES_PORT: tcp://appkube-dns-7bp3b9jc.hcp.eastus.azmk8s.io:443
KUBERNETES_PORT_443_TCP: tcp://appkube-dns-7bp3b9jc.hcp.eastus.azmk8s.io:443
KUBERNETES_PORT_443_TCP_ADDR: appkube-dns-7bp3b9jc.hcp.eastus.azmk8s.io
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cl84l (ro)
Containers:
alb-controller-bootstrap:
Container ID:
Image: mcr.microsoft.com/application-lb/images/alb-controller-bootstrap:1.2.3
Image ID:
Port: 9005/TCP
Host Port: 0/TCP
Command:
/alb-controller-bootstrap
Args:
--log-level
info
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 200m
memory: 128Mi
Requests:
cpu: 100m
memory: 128Mi
Liveness: http-get http://:9005/healthz delay=5s timeout=5s period=10s #success=1 #failure=3
Readiness: http-get http://:9005/healthz delay=5s timeout=5s period=10s #success=1 #failure=3
Environment:
KUBERNETES_SERVICE_HOST: appkube-dns-7bp3b9jc.hcp.eastus.azmk8s.io
KUBERNETES_PORT: tcp://appkube-dns-7bp3b9jc.hcp.eastus.azmk8s.io:443
KUBERNETES_PORT_443_TCP: tcp://appkube-dns-7bp3b9jc.hcp.eastus.azmk8s.io:443
KUBERNETES_PORT_443_TCP_ADDR: appkube-dns-7bp3b9jc.hcp.eastus.azmk8s.io
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cl84l (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-cl84l:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned azure-alb-system/alb-controller-bootstrap-778f96cfb4-mhdlw to aks-userpool-22578432-vmss000000
Normal Pulling 10m kubelet Pulling image "mcr.microsoft.com/application-lb/images/alb-controller-crds:1.2.3"
Normal Pulled 10m kubelet Successfully pulled image "mcr.microsoft.com/application-lb/images/alb-controller-crds:1.2.3" in 5.681s (5.681s including waiting). Image size: 22749663 bytes.
Normal Created 8m28s (x5 over 10m) kubelet Created container init-alb-controller-crds
Normal Started 8m28s (x5 over 10m) kubelet Started container init-alb-controller-crds
Normal Pulled 8m28s (x4 over 10m) kubelet Container image "mcr.microsoft.com/application-lb/images/alb-controller-crds:1.2.3" already present on machine
Warning BackOff 1s (x47 over 10m) kubelet Back-off restarting failed container init-alb-controller-crds in pod alb-controller-bootstrap-778f96cfb4-mhdlw_azure-alb-system(d704c403-e510-44d8-ac2b-ead7efc43511)
Upon deeper inspection, I think the issue is with the image the controller is using.
The controller is using
mcr.microsoft.com/application-lb/images/alb-controller-crds:1.2.3
which doesn't have an ARM equivalent.
Step 3 of the image build runs:
RUN /bin/sh -c wget https://storage.googleapis.com/kubernetes-release/release/v1.30.1/bin/linux/amd64/kubectl -O /bin/kubectl && chmod +x /bin/kubectl
Here the kubectl download is hard-coded to the amd64 architecture, so the binary fails when it is executed on an ARM cluster.
Hope this was helpful
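To confirm that a specific binary (such as the kubectl downloaded in that build step) targets the wrong architecture, you can read the machine field from its ELF header. This is a minimal sketch, not part of the ALB Controller tooling: `elf_architecture` and `fake_elf_header` are hypothetical helper names, and the machine codes come from the ELF specification (62 is x86-64, 183 is AArch64).

```python
import struct

# e_machine values from the ELF specification, mapped to Go/Kubernetes-style names.
ELF_MACHINES = {3: "386", 40: "arm", 62: "amd64", 183: "arm64"}


def elf_architecture(header: bytes) -> str:
    """Return the target architecture encoded in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF binary")
    # Byte 5 (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, right after e_type.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, "unknown({})".format(machine))


def fake_elf_header(machine: int) -> bytes:
    """Build a synthetic 64-bit little-endian ELF header prefix for demonstration."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)  # 16-byte e_ident
    return ident + struct.pack("<HH", 2, machine)     # e_type=ET_EXEC, e_machine


print(elf_architecture(fake_elf_header(62)))
print(elf_architecture(fake_elf_header(183)))
```

Against a real file, you would pass the first 20 bytes, for example `open("/bin/kubectl", "rb").read(20)`. An `amd64` result on an `arm64` node matches the "exec format error" seen in the init container.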
Running docker inspect on the controller image paints a similar picture:
docker inspect mcr.microsoft.com/application-lb/images/alb-controller-crds:1.2.3
[
{
"Id": "sha256:71dc7b7cc810a8eefb5d2fc12253a2aac42785483277101b93d90e83761aa218",
"RepoTags": [
"mcr.microsoft.com/application-lb/images/alb-controller-crds:1.2.3"
],
"RepoDigests": [
"mcr.microsoft.com/application-lb/images/alb-controller-crds@sha256:71dc7b7cc810a8eefb5d2fc12253a2aac42785483277101b93d90e83761aa218"
],
"Parent": "",
"Comment": "buildkit.dockerfile.v0",
"Created": "2024-08-30T16:36:00.029033041Z",
"DockerVersion": "27.2.0",
"Author": "",
"Config": {
"Hostname": "",
"Domainname": "",
"User": "",
"AttachStdin": false,
"AttachStdout": false,
"AttachStderr": false,
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
],
"Cmd": null,
"ArgsEscaped": true,
"Image": "",
"Volumes": null,
"WorkingDir": "/",
"Entrypoint": [
"/bin/kubectl"
],
"OnBuild": null,
"Labels": {
"com.visualstudio.msazure.image.build.buildnumber": "1.2.3",
"com.visualstudio.msazure.image.build.builduri": "vstfs:///Build/Build/102031627",
"com.visualstudio.msazure.image.build.definitionname": "Networking-Kubic-Official",
"com.visualstudio.msazure.image.build.repository.name": "Networking-Kubic",
"com.visualstudio.msazure.image.build.repository.uri": "https://msazure.visualstudio.com/One/_git/Networking-Kubic",
"com.visualstudio.msazure.image.build.sourcebranchname": "1.2",
"com.visualstudio.msazure.image.build.sourceversion": "3c7c80cee2980b1099dd12b97d7bc24acf9490bb",
"com.visualstudio.msazure.image.system.teamfoundationcollectionuri": "https://msazure.visualstudio.com/",
"com.visualstudio.msazure.image.system.teamproject": "One",
"image.base.digest": "sha256:e01de8a38d8a6ea1bc7212b4875b084bbb12dc3b7d93c570231a61887e04e5c8",
"image.base.ref.name": "mcr.microsoft.com/cbl-mariner/busybox:1.35"
}
},
"Architecture": "amd64",
"Os": "linux",
"Size": 22749663,
"GraphDriver": {
"Data": null,
"Name": "overlayfs"
},
"RootFS": {
"Type": "layers",
"Layers": [
"sha256:fa7546a223a4f6cf426563f87ff81033a891a2a4048a99e987026ff7753440fe",
"sha256:9f2106c783cbea22d069ff58967fa5891579bfe08f15af51f1d9d0e283c23715",
"sha256:095655321eed6ff423681ae396f527e1c7af63296a56a73e7444002919bacad7",
"sha256:d8a13698cfbcda3399b049841cd19985e40bf32e1dba1833aa221a11f63abd3b"
]
},
"Metadata": {
"LastTagTime": "2024-09-18T21:24:56.720106714Z"
}
}
]
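The relevant fields in that output are "Architecture" and "Os". As a rough sketch, the check can be automated by parsing the `docker inspect` JSON and comparing it with the node architecture. The function names here are hypothetical, and the sample is a trimmed copy of the output above.

```python
import json


def image_platforms(inspect_json: str) -> list:
    """Return (tag, os, architecture) for each image in `docker inspect` output."""
    platforms = []
    for image in json.loads(inspect_json):
        tags = image.get("RepoTags") or ["<untagged>"]
        platforms.append((tags[0], image.get("Os"), image.get("Architecture")))
    return platforms


def runs_on(node_arch: str, inspect_json: str) -> bool:
    """True only if every inspected image was built for node_arch."""
    return all(arch == node_arch for _, _, arch in image_platforms(inspect_json))


# Trimmed copy of the inspect output shown in this thread.
sample = json.dumps([
    {
        "RepoTags": ["mcr.microsoft.com/application-lb/images/alb-controller-crds:1.2.3"],
        "Architecture": "amd64",
        "Os": "linux",
    }
])

print(image_platforms(sample))
print(runs_on("arm64", sample))
```

Note that `docker inspect` describes only the variant that was pulled locally, so this confirms what the local image is but does not by itself prove that no other variant exists in the registry.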
Sorry, I misread ARM as Azure Resource Manager, not ARM in the context of compute architecture.
Unfortunately, AGC does not support ARM-based compute today. We will update our docs, and I have added this as a future feature item to our backlog.
Thank you for bubbling up!
Thanks @JackStromberg. Is the source code for the controller publicly available? If possible, we could lend a hand and contribute to it.
@a7ul, apologies for the slow response. I appreciate the offer; however, ALB Controller is closed source, and we don't have plans to make it open source at this time. The ask for supporting ARM architecture is certainly valid, and I've labeled it as an item needed for AGIC to AGC parity.
Thank you again for bubbling up!
+1 for ARM support. This is the only workload on our cluster requiring us to spin up an x86 node.
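Until ARM images exist, a mixed cluster needs the controller pods to land on x86 nodes. One possible approach, sketched here under assumptions rather than confirmed by the maintainers, is a merge patch that adds a `nodeSelector` on the well-known `kubernetes.io/arch` node label. The Deployment name `alb-controller-bootstrap` is inferred from the ReplicaSet name in the describe output above, and a later Helm upgrade may revert a manual patch like this.

```python
import json


def arch_node_selector_patch(arch: str = "amd64") -> str:
    """Build a merge patch that pins a Deployment's pods to nodes of one architecture."""
    patch = {
        "spec": {
            "template": {
                "spec": {
                    # kubernetes.io/arch is a well-known label set by the kubelet.
                    "nodeSelector": {"kubernetes.io/arch": arch}
                }
            }
        }
    }
    return json.dumps(patch)


patch = arch_node_selector_patch()
print(patch)
# Possible usage (deployment name is an assumption):
#   kubectl patch deployment alb-controller-bootstrap -n azure-alb-system \
#     --type merge -p '<patch printed above>'
```

The same selector would have to be applied to any other workload the chart deploys, so check the chart's own values for a supported setting before patching by hand.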
Describe the bug: The Application Gateway for Containers' ALB Controller is not functioning on ARM architecture when following the provided documentation.
To Reproduce: Steps to reproduce the behavior:
Expected behavior: The ALB Controller installation should start and register itself as healthy.
Environment (please complete the following information):