aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[EKS] [request]: Enable Worker Orchestration in separate VPC to Cluster #153

Open tomhaynes opened 5 years ago

tomhaynes commented 5 years ago

Tell us about your request Currently it does not seem possible to orchestrate workers in a different VPC from the EKS control plane.

Which service(s) is this request for? EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? We currently isolate development environments in separate VPCs, and are looking to adopt EKS. The cleanest pattern for us would be to stand up one EKS cluster that can embed workers into each of the development environment VPCs.

Are you currently working around this issue? Currently we would have to stand up separate EKS clusters per environment.

orkaa commented 5 years ago

@tomhaynes I'm thinking about a similar thing. Can you just confirm you also tried VPC peering and it still didn't work? Thanks!

tomhaynes commented 5 years ago

Hi @orkaa - yeah our VPCs are peered. I found that worker nodes were able to join fine, and EKS is able to schedule pods to them.

I realised the issue when I found that kubectl logs times out on these pods - it currently seems the functionality is only half there at the moment.

orkaa commented 5 years ago

Makes sense. Thanks for the confirmation @tomhaynes 👍

christopherhein commented 5 years ago

    I realised the issue when I found that kubectl logs times out on these pods - it currently seems the functionality is only half there at the moment.

@tomhaynes & @orkaa this makes sense. The reason for this is that during provisioning the control plane adds a cross-account ENI into your VPC, which kubectl logs, exec, and port-forward use.

In your dev environments, do you need your pods isolated at the VPC level? We have a way of extending VPC CIDRs using the AWS VPC CNI that would allow you to allocate non-RFC1918 ranges which can have their own security groups, and which you might be able to treat as isolated environments - https://docs.aws.amazon.com/eks/latest/userguide/cni-custom-network.html
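For reference, the custom networking setup described in that doc revolves around ENIConfig objects; a minimal sketch is below. The subnet and security group IDs are placeholders, and it assumes AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true is set on the aws-node DaemonSet.

```yaml
# Minimal ENIConfig sketch for VPC CNI custom networking; IDs are placeholders.
# By default the CNI matches an ENIConfig named after the node's availability zone.
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: us-west-2a                      # matched to the node's AZ by default
spec:
  subnet: subnet-0123456789abcdef0      # secondary-CIDR subnet used for pod ENIs
  securityGroups:
    - sg-0123456789abcdef0              # security groups applied to pod ENIs
```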

Interested in hearing your thoughts…

tomhaynes commented 5 years ago

@christopherhein so a potential way to add this functionality could be the ability to add additional ENIs into other VPCs?

Our current environment pattern is to have one infra VPC that contains support / orchestration functionality. This is peered to multiple app VPCs that contain separate development environments.

Ideally we would plug EKS into this setup by running the EKS cluster in the infra VPC, and embedding worker nodes into each app VPC.

lra commented 5 years ago

This would solve another issue: When EKS is not available in a region, we could launch EKS in a supported region and launch the worker nodes in another region, using a peered VPC.

This would give your customers stuck in us-west-1 an option to use EKS ;)

unused1 commented 5 years ago

This would help us adopt the good practice of having the API servers in a central management VPC, separating the cluster endpoints from the worker node VPCs/accounts where the apps run.

yoda commented 4 years ago

Would like this functionality in order to be able to have a cluster spanning multiple regions, where pod affinities could be used to specify regions etc. It would also reduce the cost / overhead of having to maintain multiple EKS clusters for little benefit.
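Purely as an illustration of that idea (a single EKS cluster cannot actually span regions today), region pinning would just be ordinary node affinity on the well-known region label; the pod name and image below are placeholders:

```yaml
# Illustrative only: pins a pod to nodes in one region via standard node affinity.
apiVersion: v1
kind: Pod
metadata:
  name: region-pinned-example           # hypothetical pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: topology.kubernetes.io/region
                operator: In
                values: ["us-west-2"]
  containers:
    - name: app
      image: nginx:stable               # placeholder image
```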

cromega commented 4 years ago

@yoda Sorry for bumping an old issue, have you managed to set up a multi-region cluster in the end? I'm trying to do the same. I have a node trying to join from a peered VPC from another region but the control plane rejects all requests with Unauthorized even though I'm using the same role as other nodes that work fine. Looks like kubectl tokens generated in another region are somehow not valid.

diegoduarte-picpay commented 4 years ago

Hello there!

Any progress on this issue/request? I stumbled upon a situation where this'd fit very well in our PCI environment, and I'd love to see it working.

nmarus commented 4 years ago

Looking for similar functionality for a PCI environment without having to spin up 2 EKS clusters. Ideally, we would have a single EKS control plane and then node groups, each individually assigned to subnets in their own VPC.

For Example:
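A hypothetical layout along those lines (illustrative names only; this is not valid configuration today, just a sketch of the requested topology):

```yaml
# Hypothetical topology for the request above - not something EKS supports today.
controlPlane:
  vpc: vpc-shared-services              # single EKS control plane lives here
nodeGroups:
  - name: general-workloads
    vpc: vpc-apps                       # non-PCI application VPC
    subnets: [subnet-apps-a, subnet-apps-b]
  - name: pci-workloads
    vpc: vpc-pci                        # isolated PCI VPC
    subnets: [subnet-pci-a, subnet-pci-b]
```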

raoofm commented 3 years ago

Is this feature being implemented? It would help if there were an indication of whether the proposal has been accepted, or a tentative release date.

ForbiddenEra commented 1 year ago

Bump and +1? I was just about to attempt this but now I'm not sure it'd even be possible, though this could be useful functionality.

We have a main infra [VPC A] that has the EKS cluster, with an EKS-managed node group running core stuff like plugins and a few other node groups running core infra services. We also have other app VPCs [VPC B, etc.] that have transit gateway/peering connectivity to the infra VPC for GWLB + using its instances for egress internet traffic.

Thus it would be awesome to eliminate the secondary EKS cluster/control plane in the apps VPCs and just deploy appropriate node groups into the apps VPCs instead. This could potentially eliminate some redundant (not the good kind of redundant 😉) stuff if one doesn't need multiple control planes + in my case, the eks-managed node group running the plugins etc.

Although, assuming it would be possible, I do wonder about using the custom-networking CNI and putting pods on a separate subnet from the nodes... at least you can potentially specify separate subnets for the separate node groups.

Perhaps if this functionality were implemented, the custom networking VPC CNI plugin could at least be set up in a way where you could either run multiple instances of it, with each instance only acting on pods with certain tags/taints, or allow a single instance to have separate ENIConfigs for each node group, again selected by tags/taints or something.
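For what it's worth, the VPC CNI's custom networking can already select an ENIConfig per node via a node label rather than the availability zone, which gets close to the per-node-group idea, though the pod subnets must still live in the cluster VPC (or its secondary CIDRs). A rough sketch, with placeholder names and IDs:

```yaml
# Rough sketch: one ENIConfig per node group, selected by node label.
# Assumes the aws-node DaemonSet has:
#   AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG: "true"
#   ENI_CONFIG_LABEL_DEF: "k8s.amazonaws.com/eniConfig"
# and that each node group sets a label like k8s.amazonaws.com/eniConfig=apps-pods.
# Subnets must still belong to the cluster VPC or its secondary CIDRs.
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: apps-pods
spec:
  subnet: subnet-aaaaaaaaaaaaaaaaa      # placeholder pod subnet for the apps node group
  securityGroups:
    - sg-aaaaaaaaaaaaaaaaa
---
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: infra-pods
spec:
  subnet: subnet-bbbbbbbbbbbbbbbbb      # placeholder pod subnet for the infra node group
  securityGroups:
    - sg-bbbbbbbbbbbbbbbbb
```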

I've read that one can register external clusters with EKS, but IIRC what I read was that it's basically a read-only/monitoring kind of deal.

Perhaps, as each EKS cluster has a non-trivial cost even without any nodes or pods, AWS doesn't want to implement this so as to force you to run an EKS control plane in each VPC for that sweet sweet cash 💰? I'd really like to think that wouldn't be the blocker on this ability/functionality - but it's been over 4 years since this issue was opened without even a chirp from anyone at AWS?

Unless someone's figured out an existing method/workaround to make it happen?

ssdrosos commented 1 year ago

I just found this issue after attempting something similar myself as a test setup. I wanted to see if I could set up the EKS control plane in one region, e.g. eu-south-1, and then set up a worker group in eu-north-1. The two regions were connected with a VPC peering connection.

What I found after fiddling around with the EKS bootstrap.sh and the EKS control plane logs is that you CAN set up appropriate user-data on the worker nodes in the other region, but then you hit the EKS implementation wall. Meaning, EKS grants access to nodes via the aws-auth ConfigMap, which is "partially" customizable, as in the following example ConfigMap entry:

arn:aws:iam::{ACCOUNT_ID}:role/eksctl-be-sandbox-eks01-nodegroup-NodeInstanceRole-1MMQM1ZUUKZSG    system:node:{{EC2PrivateDNSName}}    system:bootstrappers,system:nodes

Based on that, the EKS control plane will try to look up the connecting worker's private DNS name so it can finish the pairing and let the node join the cluster. However, that lookup can only happen in the VPC/region the control plane is deployed to.

You can potentially work around this problem with some more bash+terraform automation and basically add a new aws-auth ConfigMap entry manually each time a new node tries to join the cluster, so that {{EC2PrivateDNSName}} is replaced with a real value. You could then have something like the following:

arn:aws:iam::{AWS_ACCOUNT_ID}:role/eksctl-be-sandbox-eks01-nodegroup-NodeInstanceRole-1MMQM1ZUUKZSG system:node:ip-10-138-58-178.eu-north-1.compute.internal    system:bootstrappers,system:nodes   
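Expressed as the aws-auth ConfigMap itself, that entry would look roughly like this (account ID and role name taken from the example above, and would differ per cluster):

```yaml
# Sketch of the aws-auth ConfigMap carrying the manually resolved node entry.
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::{AWS_ACCOUNT_ID}:role/eksctl-be-sandbox-eks01-nodegroup-NodeInstanceRole-1MMQM1ZUUKZSG
      username: system:node:ip-10-138-58-178.eu-north-1.compute.internal
      groups:
        - system:bootstrappers
        - system:nodes
```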

What this will do is move the EKS join workflow a bit further: the EKS control plane now knows the DNS name of the joining node, but it will still need to look up the instance ID of the joining node. This operation will still fail because, again, EKS has not been implemented to support a cross-region setup in its current state.

So, long story short: it is not possible to make worker nodes join an EKS control plane from another VPC or region. I got the same response from AWS Support:

With regards to your query, note that EKS cluster and node groups are currently limited to the same VPC, where the nodes must be in the same VPC as the EKS cluster:
https://docs.aws.amazon.com/eks/latest/userguide/eks-compute.html

    Nodes must be in the same VPC as the subnets you selected when you created the cluster. However, the nodes don't have to be in the same subnets.

Hopefully, there will be enough demand at some point that they will implement this functionality. I too am at a place where I have to manage 5 different EKS control planes, and such a feature would take away a lot of the pain around EKS upgrades.