lyft / cni-ipvlan-vpc-k8s

AWS VPC Kubernetes CNI driver using IPvlan
Apache License 2.0

sneaky swallowed error #40

Closed chris-h-phillips closed 6 years ago

chris-h-phillips commented 6 years ago

After trying #37 out on a busier cluster, I noticed that the problem we had with missing VPC routes still occurred. I looked closer at the metadata.go code, and it appears that we were still swallowing errors from the metadata service. These errors should have caused retries in the ipam plugin, but they weren't bubbled up.

I did some testing by forcing mass rescheduling of pods onto new nodes that didn't have any extra ENIs attached. The idea was to produce as many instances as possible of the first pod or two being mapped to an ENI, and to observe the unstable behavior of the instance metadata service during this initialization window. The plugin logged many 404 errors right after the ENIs were attached, but the errors were surfaced to the right place, and after a few retries against the metadata service the pod network was always set up correctly.
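To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the code in metadata.go; every name in it (`lookupFunc`, `getSwallowed`, `getPropagated`, `withRetry`, `flakyLookup`, and the key `example-key`) is hypothetical. It assumes a lookup that returns a 404-style error for its first few calls, roughly as the metadata service might right after an ENI is attached, and contrasts dropping that error with returning it so a retry loop can act on it.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotFound is a hypothetical stand-in for a 404 from the metadata service.
var errNotFound = errors.New("metadata: 404 not found")

// lookupFunc is a hypothetical stand-in for a single metadata query.
type lookupFunc func(key string) (string, error)

// getSwallowed drops the error. The caller only sees an empty value and has
// no way to know it should retry.
func getSwallowed(lookup lookupFunc, key string) string {
	v, _ := lookup(key)
	return v
}

// getPropagated returns the error, wrapped with context, so callers can react.
func getPropagated(lookup lookupFunc, key string) (string, error) {
	v, err := lookup(key)
	if err != nil {
		return "", fmt.Errorf("metadata lookup %q: %w", key, err)
	}
	return v, nil
}

// withRetry calls fn up to attempts times, sleeping delay between failures.
// It returns the first successful value, or the last error seen.
func withRetry(attempts int, delay time.Duration, fn func() (string, error)) (string, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		v, err := fn()
		if err == nil {
			return v, nil
		}
		lastErr = err
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return "", fmt.Errorf("giving up after %d attempts: %w", attempts, lastErr)
}

// flakyLookup simulates a service that fails for the first `failures` calls
// and then returns value.
func flakyLookup(failures int, value string) lookupFunc {
	calls := 0
	return func(string) (string, error) {
		calls++
		if calls <= failures {
			return "", errNotFound
		}
		return value, nil
	}
}

func main() {
	// Swallowed: the first call fails, and the caller silently gets "".
	fmt.Printf("swallowed: %q\n", getSwallowed(flakyLookup(2, "10.0.0.0/16"), "example-key"))

	// Propagated: the retry loop sees the errors and keeps trying.
	lookup := flakyLookup(2, "10.0.0.0/16")
	v, err := withRetry(5, 10*time.Millisecond, func() (string, error) {
		return getPropagated(lookup, "example-key")
	})
	fmt.Printf("propagated: %q, err=%v\n", v, err)
}
```

The retry loop can only help if the lowest layer returns the error. In the swallowed variant, an empty string is indistinguishable from a legitimate empty answer, so a caller would proceed with incomplete data instead of retrying.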