aws / amazon-vpc-cni-k8s

Networking plugin repository for pod networking in Kubernetes using Elastic Network Interfaces on AWS
Apache License 2.0
2.29k stars 742 forks source link

Fix issues with handling unmanaged ENIs with IPv6 only #3122

Open gavinbunney opened 1 week ago

gavinbunney commented 1 week ago

What type of PR is this? bug

Which issue does this PR fix?:

We have unmanaged trunk ENIs attached to our worker nodes which are designed for ipv6 only networks. These ENIs are tagged with node.k8s.amazonaws.com/no_manage, however the listing of attached ENIs happens before the IPAMD process filters those ENIs. As such, the metadata retrieval process fails when looking up the ipv4 address details for these ENIs, and causes the aws-k8s-agent process to exit with an initialization failure.

In this log, eni-084b51xxxxxx / 0e:7c:f9:xx:xx:xx is the aws-vpc-cni managed ENI, and eni-0b2cf8b6xxxxxx / 0e:f3:73:xx:xx:xx is our managed eni:

{"level":"debug","ts":"2024-11-22T16:49:26.314Z","caller":"awsutils/awsutils.go:1317","msg":"Total number of interfaces found: 2 "}
{"level":"debug","ts":"2024-11-22T16:49:26.314Z","caller":"awsutils/awsutils.go:567","msg":"Found ENI MAC address: 0e:7c:f9:xx:xx:xx"}
{"level":"debug","ts":"2024-11-22T16:49:26.316Z","caller":"awsutils/awsutils.go:567","msg":"Found ENI: eni-084b51xxxxxx, MAC 0e:7c:f9:xx:xx:xx, device 0"}
{"level":"debug","ts":"2024-11-22T16:49:26.318Z","caller":"awsutils/awsutils.go:567","msg":"Found IPv6 addresses associated with interface. This is not efa-only interface"}
{"level":"debug","ts":"2024-11-22T16:49:26.322Z","caller":"awsutils/awsutils.go:567","msg":"Found ENI MAC address: 0e:f3:73:xx:xx:xx"}
{"level":"debug","ts":"2024-11-22T16:49:26.324Z","caller":"awsutils/awsutils.go:567","msg":"Found ENI: eni-0b2cf8b6xxxxxx, MAC 0e:f3:73:xx:xx:xx, device 1"}
{"level":"debug","ts":"2024-11-22T16:49:26.326Z","caller":"awsutils/awsutils.go:567","msg":"Found IPv6 addresses associated with interface. This is not efa-only interface"}
{"level":"warn","ts":"2024-11-22T16:49:26.327Z","caller":"awsutils/imds.go:376","msg":"failed to retrieve network/interfaces/macs/0e:f3:73:xx:xx:xx/subnet-ipv4-cidr-block from instance metadata EC2MetadataError: failed to make EC2Metadata request\n<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>\n<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\"\n\t\t \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\" lang=\"en\">\n <head>\n  <title>404 - Not Found</title>\n </head>\n <body>\n  <h1>404 - Not Found</h1>\n </body>\n</html>\n\n\tstatus code: 404, request id: "}
{"level":"error","ts":"2024-11-22T16:49:26.327Z","caller":"aws-k8s-agent/main.go:42","msg":"Initialization failure: ipamd init: failed to retrieve attached ENIs info: DescribeAllENIs: failed to get local ENI metadata: get attached ENIs: failed to retrieve ENI metadata for ENI: 0e:f3:73:xx:xx:xx: EC2MetadataError: failed to make EC2Metadata request\n<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>\n<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\"\n\t\t \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\" lang=\"en\">\n <head>\n  <title>404 - Not Found</title>\n </head>\n <body>\n  <h1>404 - Not Found</h1>\n </body>\n</html>\n\n\tstatus code: 404, request id: "}

What does this PR do / Why do we need it?:

This PR:

Testing done on this change:

Added unit tests to cover the new paths. Running in our EKS cluster the ENIs are now retrieved successfully (In this log, eni-084b51xxxxxx / 0e:7c:f9:xx:xx:xx is the aws-vpc-cni managed ENI, and eni-0b2cf8b6xxxxxx / 0e:f3:73:xx:xx:xx is our managed eni):

{"level":"debug","ts":"2024-11-22T17:45:13.815Z","caller":"awsutils/awsutils.go:1325","msg":"Total number of interfaces found: 2 "}
{"level":"debug","ts":"2024-11-22T17:45:13.815Z","caller":"awsutils/awsutils.go:567","msg":"Found ENI MAC address: 0e:7c:f9:8b:06:fd"}
{"level":"debug","ts":"2024-11-22T17:45:13.817Z","caller":"awsutils/awsutils.go:567","msg":"Found ENI: eni-084b51xxxxxx, MAC 0e:7c:f9:xx:xx:xx, device 0"}
{"level":"debug","ts":"2024-11-22T17:45:13.818Z","caller":"awsutils/awsutils.go:567","msg":"Found IPv6 addresses associated with interface. This is not efa-only interface"}
{"level":"debug","ts":"2024-11-22T17:45:13.819Z","caller":"awsutils/awsutils.go:567","msg":"Found ENI MAC address: 0e:f3:73:xx:xx:xx"}
{"level":"debug","ts":"2024-11-22T17:45:13.821Z","caller":"awsutils/awsutils.go:567","msg":"Found ENI: eni-0b2cf8b6xxxxxx, MAC 0e:f3:73:xx:xx:xx, device 1"}
{"level":"debug","ts":"2024-11-22T17:45:13.822Z","caller":"awsutils/awsutils.go:567","msg":"Found IPv6 addresses associated with interface. This is not efa-only interface"}
{"level":"info","ts":"2024-11-22T17:45:14.003Z","caller":"ipamd/ipamd.go:424","msg":"Got network card index 0 for ENI eni-084b51xxxxxx"}
{"level":"info","ts":"2024-11-22T17:45:14.004Z","caller":"ipamd/ipamd.go:424","msg":"eni-084b51xxxxxx is of type: interface"}
{"level":"info","ts":"2024-11-22T17:45:14.004Z","caller":"ipamd/ipamd.go:424","msg":"Got network card index 0 for ENI eni-0b2cf8b6xxxxxx"}
{"level":"info","ts":"2024-11-22T17:45:14.004Z","caller":"ipamd/ipamd.go:424","msg":"eni-0b2cf8b6xxxxxx is of type: trunk"}
{"level":"debug","ts":"2024-11-22T17:45:14.004Z","caller":"ipamd/ipamd.go:385","msg":"DescribeAllENIs success: ENIs: 2, tagged: 2"}

Will this PR introduce any new dependencies?: n/a

Will this break upgrades or downgrades? Has updating a running cluster been tested?: Tested with upgrading inplace without issues

Does this change require updates to the CNI daemonset config files to work?: n/a

Does this PR introduce any user-facing change?: n/a

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

gavinbunney commented 5 days ago

@orsenthil @jaydeokar Would you be able to take a look? We are attempting to work around issues with our primarily-ipv6 network so would be good to get this merged in

orsenthil commented 5 days ago

@gavinbunney - I will review this shortly. It is my radar.