Closed: aspnet4you closed this issue 4 years ago
Hi, would you help share the logs from the controller pod?
By the way, where is your controller running? If it is running as a Fargate pod itself, you need to specify --aws-vpc-id and --aws-region.
@M00nF1sh, Thank you for responding to my question. Unfortunately, I didn't check the logs in the ingress controller before deleting the EKS cluster. Any suggestions before I retry EKS Fargate with ALB?
The ingress controllers (pods) were running in the kube-system namespace. I did specify aws-vpc-id and aws-region in the deployment yaml. For this PoC, I didn't have any node group, just a Fargate profile. Here is my ingress yaml: https://raw.githubusercontent.com/aspnet4you/eks-fargate-poc/master/alb-ingress-controller.yaml
@aspnet4you Pure Fargate (without any node group) should work fine. (I tested v1.1.4, which you are using, and it works fine.) One tip is to change v1.1.4 to v1.1.6 for the latest code (but none of those fixes is related to your issue).
From the controller log, you should see what's wrong. Typically it's an IAM permission or a mistagged subnet.
@M00nF1sh, Thanks for the suggestion. I will try the latest version.
I was overly cautious on subnet tags and both the public and private pairs were tagged correctly. Learned that from previous poc with eks and ec2! Matter of fact, eksctl tool did that for me with security groups wide open to all traffic all ports!
@M00nF1sh , Below is what I see in the logs and no ALB! Can't make anything out of the logs. What can possibly go wrong? I downloaded the latest IAM policy from github.
kubectl logs -p alb-ingress-controller-5db898488b-bqrf6 -n kube-system
W0324 00:42:59.659618 1 client_config.go:549] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
E0324 00:43:29.660449 1 manager.go:173] kubebuilder/manager "msg"="Failed to get API Group-Resources" "error"="Get https://10.100.0.1:443/api?timeout=32s: dial tcp 10.100.0.1:443: i/o timeout"
F0324 00:43:29.660488 1 main.go:84] Get https://10.100.0.1:443/api?timeout=32s: dial tcp 10.100.0.1:443: i/o timeout
Thanks, Prodip
@M00nF1sh : More logs.. see the attached file for formatted logs. kubectl logs -f alb-ingress-controller-5db898488b-bqrf6 -n kube-system
W0324 00:43:30.859177 1 client_config.go:549] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0324 00:43:30.970685 1 controller.go:121] kubebuilder/controller "level"=0 "msg"="Starting EventSource" "controller"="alb-ingress-controller" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I0324 00:43:30.970902 1 controller.go:121] kubebuilder/controller "level"=0 "msg"="Starting EventSource" "controller"="alb-ingress-controller" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{},"status":{"loadBalancer":{}}}}
I0324 00:43:30.970963 1 controller.go:121] kubebuilder/controller "level"=0 "msg"="Starting EventSource" "controller"="alb-ingress-controller" "source"=
I0324 00:43:30.971098 1 controller.go:121] kubebuilder/controller "level"=0 "msg"="Starting EventSource" "controller"="alb-ingress-controller" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{},"status":{"loadBalancer":{}}}}
I0324 00:43:30.971131 1 controller.go:121] kubebuilder/controller "level"=0 "msg"="Starting EventSource" "controller"="alb-ingress-controller" "source"=
I0324 00:43:30.971266 1 controller.go:121] kubebuilder/controller "level"=0 "msg"="Starting EventSource" "controller"="alb-ingress-controller" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I0324 00:43:30.971574 1 controller.go:121] kubebuilder/controller "level"=0 "msg"="Starting EventSource" "controller"="alb-ingress-controller" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{},"status":{"daemonEndpoints":{"kubeletEndpoint":{"Port":0}},"nodeInfo":{"machineID":"","systemUUID":"","bootID":"","kernelVersion":"","osImage":"","containerRuntimeVersion":"","kubeletVersion":"","kubeProxyVersion":"","operatingSystem":"","architecture":""}}}}
I0324 00:43:31.044029 1 leaderelection.go:205] attempting to acquire leader lease kube-system/ingress-controller-leader-alb...
I0324 00:43:31.057484 1 leaderelection.go:214] successfully acquired lease kube-system/ingress-controller-leader-alb
I0324 00:43:31.057674 1 recorder.go:53] kubebuilder/manager/events "level"=1 "msg"="Normal" "message"="alb-ingress-controller-5db898488b-bqrf6_7bd33a30-6d68-11ea-994e-7290c1c88576 became leader" "object"={"kind":"ConfigMap","namespace":"kube-system","name":"ingress-controller-leader-alb","uid":"7bdf9bad-6d68-11ea-8108-0a9dec12172d","apiVersion":"v1","resourceVersion":"4864"} "reason"="LeaderElection"
I0324 00:43:31.158073 1 controller.go:134] kubebuilder/controller "level"=0 "msg"="Starting Controller" "controller"="alb-ingress-controller"
I0324 00:43:31.258364 1 controller.go:154] kubebuilder/controller "level"=0 "msg"="Starting workers" "controller"="alb-ingress-controller" "worker count"=1
W0324 00:51:50.249271 1 reflector.go:270] pkg/mod/k8s.io/client-go@v0.0.0-20181213151034-8d9ed539ba31/tools/cache/reflector.go:95: watch of *v1.Secret ended with: too old resource version: 3846 (6237)
E0324 01:26:23.432194 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="no object matching key \"default/aspnetapp-ingress\" in local store" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:27:10.067226 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:27:56.072913 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:28:47.180817 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:29:37.624242 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:30:13.205391 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:30:51.391739 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:31:32.773034 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:32:21.837140 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:33:12.075720 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:33:50.826910 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:34:37.774838 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:35:28.136156 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:36:28.244697 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:38:00.702172 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:40:14.999806 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:43:45.026893 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:44:34.766833 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
E0324 01:50:01.950738 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to unable to fetch subnets. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post https://sts.'us-east-1'.amazonaws.com/: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host" "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aspnetapp-ingress"}
Private subnet tagged by eksctl, looks fine to me-
Public subnet tagged by eksctl, looks fine to me-
Hi @M00nF1sh, Any idea what may be wrong with my configuration? Looks like EKS Fargate is not mature enough for production when it comes to ingress!
Thanks, Prodip
@aspnet4you
Apparently the real cause of your issue is: dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host
Does your VPC have an internet gateway or a NAT gateway?
Note: even with Fargate, internet-bound requests from your pods still go through your VPC (we drop an ENI into your VPC).
Also, specify these settings without the quotes. Change:
- --cluster-name='eks-fargate-alb-ingress-demo'
- --aws-vpc-id='vpc-057af016ed6507b52'
- --aws-region='us-east-1'
to
- --cluster-name=eks-fargate-alb-ingress-demo
- --aws-vpc-id=vpc-057af016ed6507b52
- --aws-region=us-east-1
You can see this in the error message: the host is sts.'us-east-1'.amazonaws.com, where even the region is quoted.
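Editorial sketch of why the quotes matter: container args in a Kubernetes manifest are handed to the process as-is, with no shell in between to strip quotes, so `--aws-region='us-east-1'` delivers the value `'us-east-1'` including the quote characters. The Python below is only an illustration, not the controller's code. The helper names `flag_value`, `strip_wrapping_quotes`, and `sts_host` are hypothetical, and the hostname pattern is simply copied from the error message in the logs above.

```python
def flag_value(arg: str) -> str:
    """Return the value of a --key=value argument exactly as the process sees it."""
    _, _, value = arg.partition("=")
    return value


def strip_wrapping_quotes(value: str) -> str:
    """Remove one pair of matching quotes around a value, if present."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in ("'", '"'):
        return value[1:-1]
    return value


def sts_host(region: str) -> str:
    # Hostname pattern taken from the log line; assumed here for illustration.
    return f"sts.{region}.amazonaws.com"


quoted = flag_value("--aws-region='us-east-1'")
print(sts_host(quoted))                         # hostname with literal quotes
print(sts_host(strip_wrapping_quotes(quoted)))  # hostname once the quotes are gone
```

The first hostname is the one that appears in the "no such host" errors; the second is what the unquoted flag yields.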
@M00nF1sh, You are smart. 💯 That was it! I removed the quotes and alb provisioned as designed. You can close the issue.
I like how the ALB automatically adjusts its target backend. I changed the replica count from 2 to 3 pods and I can see the new IP is automatically added to the targets. Nice. :) This is the reason I didn't want to add an ALB manually and deal with the auto scaling myself.
Here is my ingress definition:
Ingress resource definition:
Thanks, Prodip
cool, glad it works :D
I have the exact same issue, I can't figure out what's causing it.
Pod Logs:
{"level":"error","ts":1627148976.691803,"logger":"controller","msg":"Reconciler error","controller":"ingress","name":"hello","namespace":"default","error":"couldn't auto-discover subnets: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post \"https://sts.us-west-2.amazonaws.com/\": dial tcp: lookup sts.us-west-2.amazonaws.com on 172.20.0.10:53: read udp 10.0.3.184:34703->172.20.0.10:53: read: connection refused"}
{"level":"error","ts":1627149158.8136048,"logger":"controller","msg":"Reconciler error","controller":"ingress","name":"hello","namespace":"default","error":"couldn't auto-discover subnets: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post \"https://sts.us-west-2.amazonaws.com/\": dial tcp: lookup sts.us-west-2.amazonaws.com on 172.20.0.10:53: read udp 10.0.3.184:52341->172.20.0.10:53: read: connection refused"}
{"level":"error","ts":1627149331.7705815,"logger":"controller","msg":"Reconciler error","controller":"ingress","name":"hello","namespace":"default","error":"couldn't auto-discover subnets: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post \"https://sts.us-west-2.amazonaws.com/\": dial tcp: lookup sts.us-west-2.amazonaws.com on 172.20.0.10:53: read udp 10.0.3.184:58778->172.20.0.10:53: read: connection refused"}
{"level":"error","ts":1627149528.279761,"logger":"controller","msg":"Reconciler error","controller":"ingress","name":"hello","namespace":"default","error":"couldn't auto-discover subnets: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post \"https://sts.us-west-2.amazonaws.com/\": dial tcp: lookup sts.us-west-2.amazonaws.com on 172.20.0.10:53: read udp 10.0.3.184:55073->172.20.0.10:53: read: connection refused"}
{"level":"error","ts":1627149707.0748882,"logger":"controller","msg":"Reconciler error","controller":"ingress","name":"hello","namespace":"default","error":"couldn't auto-discover subnets: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post \"https://sts.us-west-2.amazonaws.com/\": dial tcp: lookup sts.us-west-2.amazonaws.com on 172.20.0.10:53: read udp 10.0.3.184:48301->172.20.0.10:53: read: connection refused"}
Container args:
Args:
--cluster-name=app-rylqFOXa
--ingress-class=alb
--aws-region=us-west-2
--aws-vpc-id=vpc-0e200d3ae7e12447c
Role policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::203341958641:oidc-provider/oidc.eks.us-west-2.amazonaws.com/id/2917B2CCF25A5DC470EF1CF5DB059AE9"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.us-west-2.amazonaws.com/id/2917B2CCF25A5DC470EF1CF5DB059AE9:sub": "system:serviceaccount:kube-system:aws-load-balancer-controller"
        }
      }
    }
  ]
}
The public subnets tagged with:
kubernetes.io/role/elb 1
kubernetes.io/cluster/app-rylqFOXa shared
The private subnets are tagged basically the same, but with elb-internal. I'm trying out Fargate as a PoC for work. What might I be missing here?
@zquintana, Your issue is a little different from what I was facing. Your controller definition looks OK.
Do you want to double-check the VPC subnet tags on your private subnets? As per the documentation, the tag should be internal-elb and not elb-internal. https://aws.amazon.com/premiumsupport/knowledge-center/eks-vpc-subnet-discovery/
Key: kubernetes.io/role/internal-elb Value: 1
Things may have changed a bit since I performed the poc. I have all the supporting files in github.com and entrypoint is https://github.com/aspnet4you/eks-fargate-poc/blob/master/eks-fargate-alb-ingress-v2.ps1
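Editorial sketch of the tag check discussed above, using only the tag keys quoted in this thread (`kubernetes.io/role/elb`, `kubernetes.io/role/internal-elb`, and `kubernetes.io/cluster/<cluster-name>`). The function name `subnet_tag_problems` is hypothetical, and accepting `owned` alongside `shared` for the cluster tag is an assumption; whether the cluster tag is required at all may depend on the controller version.

```python
def subnet_tag_problems(tags: dict, cluster_name: str, public: bool) -> list:
    """Return human-readable problems with a subnet's tags (empty list if none)."""
    problems = []

    # Public subnets carry the elb role tag, private ones the internal-elb tag.
    role_key = "kubernetes.io/role/elb" if public else "kubernetes.io/role/internal-elb"
    if tags.get(role_key) != "1":
        problems.append(f"expected tag {role_key}=1")

    # Catch the typo from this thread explicitly.
    if "kubernetes.io/role/elb-internal" in tags:
        problems.append("found elb-internal; the key should be internal-elb")

    # Assumption: the cluster tag value is either 'shared' or 'owned'.
    cluster_key = f"kubernetes.io/cluster/{cluster_name}"
    if tags.get(cluster_key) not in ("shared", "owned"):
        problems.append(f"expected tag {cluster_key}=shared (or owned)")

    return problems


private_tags = {
    "kubernetes.io/role/elb-internal": "1",          # typo reported in the thread
    "kubernetes.io/cluster/app-rylqFOXa": "shared",
}
for problem in subnet_tag_problems(private_tags, "app-rylqFOXa", public=False):
    print(problem)
```

Feeding it the tags returned for each subnet (for example from the AWS console or CLI) gives a quick way to spot a misspelled key before digging into controller logs.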
@aspnet4you, yeah, it looks like it's internal-elb. Typo. I'm using the official AWS helm chart.
Turns out my issue was this: https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/1360. CoreDNS wasn't set up for a Fargate-only cluster.
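Editorial sketch: the two failures in this thread look alike ("failed to retrieve credentials") but the tail of the error tells them apart. "no such host" means the name itself could not be resolved (here, because of the literal quotes in the region), while "connection refused" against port 53 means nothing answered at the cluster DNS address (here, CoreDNS not running). The classifier below is a hypothetical helper based only on the message fragments seen in the logs above, not on any controller API.

```python
def classify_lookup_error(message: str) -> str:
    """Roughly classify a Go 'dial tcp: lookup ...' error string from the logs."""
    if "lookup " not in message:
        return "not-a-dns-lookup"
    if "no such host" in message:
        # The hostname could not be resolved: check for a malformed name.
        return "name-not-found"
    if "connection refused" in message or "i/o timeout" in message:
        # The resolver did not answer: check that cluster DNS is running.
        return "resolver-unreachable"
    return "unknown"


first_case = "dial tcp: lookup sts.'us-east-1'.amazonaws.com: no such host"
second_case = (
    "dial tcp: lookup sts.us-west-2.amazonaws.com on 172.20.0.10:53: "
    "read udp 10.0.3.184:34703->172.20.0.10:53: read: connection refused"
)
print(classify_lookup_error(first_case))
print(classify_lookup_error(second_case))
```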
I was trying to follow the documentation below to create an alb-ingress-controller with ingress resources: https://aws.amazon.com/blogs/containers/using-alb-ingress-controller-with-amazon-eks-on-fargate/
It's supposed to create an alb and bind the address field of Kubernetes ingress but the address field of ingress is empty! No error. Fargate profile has been given proper IAM permissions and service account is given RBAC based on the documentation.
I documented the steps in my blog with screenshots at https://blogs.aspnet4you.com/2020/03/17/run-serverless-kubernetes-pods-using-amazon-eks-and-aws-fargate/ and you can see that the address of the ingress is empty! The ingress pods are running fine.
I could create an alb manually which is what I did but it defeats the purpose. Any idea why alb didn't get created?
Thanks, Prodip