Open spr-mweber3 opened 1 year ago
@spr-mweber3 Can you share the ingress resources that trigger a CF stack where this happens?
I can't find this permission set anywhere in our production system, and we haven't seen the problem you describe. I wonder if you use some of the more special annotations from the controller?
There is nothing special about the annotations on those ingresses. In fact, we're applying the same ingresses with the same annotations in new clusters.
We're using the same v0.14.24 of the controller everywhere.
The additional permission doesn't seem to be needed for already deployed load balancers. We have a lot of them managed by the controller and don't see any issue but if you try to create a new one, you'll be able to see that something changed.
What's more, the naming scheme of the created load balancers has somehow changed. Earlier, the load balancers were named like kube-ing-LB-Z0LLWRJNY0IS, but new ones look different, e.g. LB-9MiEZ2OqtiIy. I think you'll spot the differences: lower-case characters, and kube-ing missing at the start.
I tried to figure out what changed, but didn't succeed. To me it looks like something changed in the AWS API, which maybe causes different behavior inside the controller as a consequence.
What? Are you sure that you run the same image? Did anything change in your AWS IAM setup?
Yep, it's true. I'm not kidding. It's the same image. We're using v0.14.24 everywhere. I just tried again to force create a new CF stack with zalando.org/aws-load-balancer-shared: "false" on a new ingress. It succeeds only if the missing permission is added to the IAM role and the name of the LB is not consistent with what it was before.
I mean, did you try it yourself as well?
I'm not aware of any changes on our side that could cause this. We're provisioning and deprovisioning identical setups, and only recently did we start to observe the issues described here.
Can that be somehow related to changes at AWS to support SGs on NLBs? Or changes in the CloudFormation API? Because in fact it's not the controller itself that creates the resources, it's the CloudFormation stack, which now seems to do things differently than before.
@spr-mweber3 can you share the ingress you use, then we can try with the same (ofc. hide your internal details like hostnames and so).
Sure thing.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    zalando.org/aws-load-balancer-shared: "false"
  name: a-name
  namespace: default
spec:
  rules:
  - host: a-name.foo.bar
    http:
      paths:
      - backend:
          service:
            name: a-service
            port:
              number: 8080
        path: /
        pathType: Prefix
Ok, that is very basic
@spr-mweber3 If you diff the CF stack of a "good" cluster and the "bad" cluster, do they differ? I guess we need to ask AWS support, because I have no idea how this can happen. Maybe it's something new they internally do for new CF stacks?
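One way to do that comparison, as a minimal sketch: export both stack templates as JSON, load them, and diff the normalized text. The function name `diff_templates`, the two template bodies, and `ExampleProperty` are hypothetical placeholders for illustration, not real stack content.

```python
import difflib
import json


def diff_templates(good: dict, bad: dict) -> list[str]:
    """Return unified-diff lines between two CloudFormation template bodies."""
    # Serialize with sorted keys so key order alone never shows up as a change.
    good_lines = json.dumps(good, indent=2, sort_keys=True).splitlines()
    bad_lines = json.dumps(bad, indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(
            good_lines, bad_lines, fromfile="good", tofile="bad", lineterm=""
        )
    )


# Hypothetical, heavily trimmed template bodies (assumption, not real data).
good = {
    "Resources": {
        "LB": {
            "Type": "AWS::ElasticLoadBalancingV2::LoadBalancer",
            "Properties": {"ExampleProperty": "a"},
        }
    }
}
bad = {
    "Resources": {
        "LB": {
            "Type": "AWS::ElasticLoadBalancingV2::LoadBalancer",
            "Properties": {"ExampleProperty": "b"},
        }
    }
}

for line in diff_templates(good, bad):
    print(line)
```

If the two templates turn out to be identical, the function returns an empty list, which would point away from the controller and towards how CloudFormation processes the stack.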
I tried it with our controller at version v0.14.30, and everything works fine without the permission. v0.14.24 uses a different version of the AWS SDK, but I don't see anything relevant in https://github.com/aws/aws-sdk-go/compare/v1.44.273...v1.44.294 because the listing doesn't provide much information. Maybe try the latest version?
Hey,
I'm referring to the documented prerequisites for the IAM policy. It seems something changed at some point, but I was not able to find out where exactly. The fact is, we didn't update the controller recently, so I think that can be ruled out.
It now seems that the policy for the controller requires the elasticloadbalancing:DescribeLoadBalancerAttributes permission. Otherwise the controller can no longer provision the resources through CloudFormation.
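To check whether an existing policy document already grants that action, a small sketch like the following may help. It only looks at `Allow` statements and their `Action` lists and ignores `Deny`, `NotAction`, `Resource`, and `Condition`, so it is a rough check rather than a full IAM evaluation. The helper name `allows_action` and the sample policy are hypothetical.

```python
from fnmatch import fnmatchcase

REQUIRED_ACTION = "elasticloadbalancing:DescribeLoadBalancerAttributes"


def allows_action(policy: dict, action: str) -> bool:
    """Return True if any Allow statement lists the action (wildcards included)."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a plain object
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        # IAM action names are case-insensitive; "*" and "?" act as wildcards.
        if any(fnmatchcase(action.lower(), a.lower()) for a in actions):
            return True
    return False


# Hypothetical policy for illustration (assumption, not the documented policy).
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["elasticloadbalancing:DescribeLoadBalancers"],
            "Resource": "*",
        }
    ],
}

print(allows_action(policy, REQUIRED_ACTION))

# Adding the action to the statement makes the check pass.
policy["Statement"][0]["Action"].append(REQUIRED_ACTION)
print(allows_action(policy, REQUIRED_ACTION))
```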