Open ArbitraryCritter opened 2 years ago
@RasmusWernerLarsen I created an issue to migrate library we use. I am not sure how much work this refactoring would cost. Would you like to work on it, such that in the next step you can add the IPv6 TG feature?
I wonder if we need a flag or if we can detect it within our code if we have IPv4/6 addresses and create the right CF stack. What do you think?
@szuecs - I'll take a look, as you said it's hard to estimate how long it would take to migrate to goformation, but I'll see if I can find the time. I would love to help out, I really like the design of this ingress controller and I've been using it for a few years now.
As for detecting it, I think that might eventually be the right way, but for now I think it's better to have the user decide. I see too many backwards compatibility breakages for people with IPv6 enabled, but without security rules for IPv6.
Also, the decision of which IP address to use has to be made when we create the target group, as the ip address type can only be set at creation (it's immutable), so you have to make that decision when you create the target group, which is create together with the ingress right now. From what little I've gleamed from the internals of the code, without some significant refactor, we won't know what IP address to set until we scan for the pods, which is done later in the process. We could ofcourse just create two target groups, one for IPv4 and one for IPv6, but that seems unnecessary to me.
Also, I believe everyone doing IPv6 only deployments on AWS right now are probably quite aware of the potential challenges, so having to set a flag will probably not be a significant barrier, whereas the backwards compatibility handling for unaware IPv6 users could be really unpleasant.
@RasmusWernerLarsen makes sense to me, so let's do the flag.
I've been working a bit with setting AWS EKS up in IPv6 only mode with the kube-ingress-aws-controller, combined with the new "--target-access-mode=AWSCNI" mode.
When I say IPv6 only deployment mode, that is not completely true, the nodes have a single IPv4 address but all internal services only have IPv6 and all external IPv4 traffic out of the cluster is NAT'ed to the node ip. Anyway, the problem with the controller is that there is currently an assumption that all target groups created will only get IPv4 targets, so I end up with an error like this.
level=error msg="unable to register instances [\"2a05:d018:1436:ff06:d7ad::a\" \"2a05:d018:1436:ff08:6caa::4\" \"2a05:d018:1436:ff08:6caa::5\"] in target group arn:aws:elasticloadbalancing:eu-west-1:411286365248:targetgroup/kube-ing-TG-8T26JSGSIFFR/06cd49c411a60939: ValidationError: The IP address '2a05:d018:1436:ff06:d7ad::a' is not a valid IPv4 address\n\tstatus code: 400, request id: 0a50cafd-8b94-40da-b6ce-c116dff991cd"
So everything works and this only fails at the AWS API invocation to register the endpoint, because the targetGroup needs to have the "ipAddressType" set to ipv6. (See: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-elasticloadbalancingv2-targetgroup.html#cfn-elasticloadbalancingv2-targetgroup-ipaddresstype)
Now I actually did checkout the code and took a look at how to implement this. My approach for a solution was to add a --target-group-ip-address-type flag, that would default to "IPV4" but could be overridden to "IPV6" globally. Unfortunately I ran into the problem that the unmaintained "github.com/mweagle/go-cloudformation" that the controller uses, hasn't updated the resource definitions for the targetGroups. It was easy to work around by modifying it manually, but I'm guessing that's not a solution here.
Aside from that, the changes needed seem trivial and low-risk.