Open multimac opened 4 months ago
PR needs rebase.
Is this a bug fix or adding new feature? New feature
What is this PR about? / Why do we need it? We noticed a problem in our clusters where EFS filesystems were being mounted across AZs. We traced this back to the way our DNS servers are set up: they are not guaranteed to be in the same availability zone as the server making the DNS request. As a result, the CSI driver can resolve the wrong IP address for a mount target, because it receives the IP address of the mount target in the AZ of the DNS server that handled the request rather than the AZ of the node.
We tried several different approaches to solve the problem, such as ensuring our DNS requests stayed in the same AZ as the server making the request (tricky, as we wanted inter-AZ DNS requests for redundancy) and passing an explicit AZ in the StorageClass (which would have made our storage classes too restrictive). Ultimately, we found the easiest way was to have the EFS CSI driver detect the availability zone the Kubernetes node is in and pass that information along when mounting the filesystem.
What testing is done? We have been running a forked version of the EFS CSI driver that includes this change for about 4 weeks now. Since including the change, we have seen all inter-AZ traffic related to EFS disappear.