aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

ec2: allow adding subnet groups/AZs after initial VPC deployment #28644

Open rix0rrr opened 10 months ago

rix0rrr commented 10 months ago

Describe the feature

With our current Vpc construct, it's easy to get going. What's not obvious, however, is that once you deploy any machines into your VPC, it becomes impossible to change the layout, even additively.

The reason is the way default CIDR allocations are done: whenever any subnet groups or AZs are added, the CIDRs of the existing subnets change. Changing a CIDR requires replacing the subnet, and that is not possible as long as any machines are attached to it. This means that changing the Vpc layout is a very disruptive operation that requires tearing down all infrastructure.

There are two decisions that cause the current behavior:

These problems are prominent in IPv4, where the available IP space is (comparatively) small and must be used efficiently. That's not to say they couldn't be lifted for IPv4 as well, but that's where the motivation for the current design comes from.

In IPv6-land though, IP space is effectively infinite, and we can do whatever.

Use Case

Schematically, this diagram shows the current problem and the proposed solution. The solution can be implemented both for IPv4 and IPv6, but should definitely be considered once IPv6-only VPCs become a thing.

In this use case, a customer has a VPC with 3 Subnet Groups spanning 3 AZs (a, b, c) and they want to add a 4th AZ (d). The same problem would occur in a slightly different shape if a new subnet group were added instead. You can see the sizes of all subnets shifting when the change is made, necessitating a replacement that will be impossible in practice:

image
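The shift can be reproduced with a small sketch. This is a simplified model, not the actual CDK allocator: the function `allocate_evenly`, the even power-of-two split, and the group-by-group ordering are all assumptions made for illustration. In this model the existing subnets happen to keep their size for this particular change, but most of them move to a different range, which is enough to force a replacement.

```python
import ipaddress

def allocate_evenly(vpc_cidr, groups, azs):
    """Simplified model: split the VPC evenly, hand out blocks group by group."""
    vpc = ipaddress.ip_network(vpc_cidr)
    count = len(groups) * len(azs)
    # Smallest number of extra prefix bits that yields at least `count` blocks.
    extra_bits = (count - 1).bit_length()
    blocks = vpc.subnets(prefixlen_diff=extra_bits)
    # Assumed ordering: all AZs of the first group, then the next group, ...
    return {(g, az): next(blocks) for g in groups for az in azs}

groups = ["Public", "Private", "Isolated"]
before = allocate_evenly("10.0.0.0/16", groups, ["a", "b", "c"])
after = allocate_evenly("10.0.0.0/16", groups, ["a", "b", "c", "d"])

# Existing subnets whose CIDR differs once AZ "d" is added.
moved = [key for key in before if before[key] != after[key]]
for key in moved:
    print(key, before[key], "->", after[key])
```

Every entry in `moved` corresponds to a subnet that would have to be replaced, even though the change was purely additive.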

Proposed Solution

The proposed solution is:

Of course, all of these sizes should be configurable.

(*) The default VPC is created with a /16 CIDR, leaving a /18 per AZ (if we assume 4 AZs), which leaves a /20 (4094 machines) per subnet if we assume a maximum of 4 subnets/groups per AZ, or a /21 (2046 machines) if we assume a maximum of 8 subnets/groups per AZ. The downside is that we would waste more than 70% of the available IP space in the default setup (only `(2^11 * 9) / (2^16)` is effectively used). We could also do things like say that Public subnets by default have a smaller size than either Private or Isolated subnets.
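The arithmetic in this note can be checked with a short sketch. It assumes a fixed layout in which every AZ gets a pre-reserved slot and every subnet group gets a pre-reserved slot inside it; the function `slot` and its parameters `az_slots` and `group_slots` are hypothetical names, not an existing CDK API.

```python
import ipaddress

def slot(vpc_cidr, az_index, group_index, az_slots=4, group_slots=8):
    """CIDR for one (AZ, group) pair under a fixed, pre-reserved layout."""
    vpc = ipaddress.ip_network(vpc_cidr)
    az_bits = (az_slots - 1).bit_length()        # 4 AZ slots -> 2 bits -> /18
    group_bits = (group_slots - 1).bit_length()  # 8 group slots -> 3 bits -> /21
    az_block = list(vpc.subnets(prefixlen_diff=az_bits))[az_index]
    return list(az_block.subnets(prefixlen_diff=group_bits))[group_index]

vpc_cidr = "10.0.0.0/16"
three_azs = {(az, g): slot(vpc_cidr, az, g) for az in range(3) for g in range(3)}
four_azs = {(az, g): slot(vpc_cidr, az, g) for az in range(4) for g in range(3)}

# Existing subnets keep their CIDRs when the 4th AZ is added.
unchanged = all(three_azs[key] == four_azs[key] for key in three_azs)

# Address space in use with 3 AZs x 3 groups of /21 subnets.
used = sum(net.num_addresses for net in three_azs.values())
total = ipaddress.ip_network(vpc_cidr).num_addresses
print(unchanged, used, total, used / total)
```

With 3 AZs and 3 groups, `used` is 9 × 2^11 addresses out of 2^16, roughly 28%, which matches the "more than 70% wasted" estimate. The price of stable CIDRs in this sketch is exactly that unused headroom.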


rix0rrr commented 10 months ago

Duplicate: https://github.com/aws/aws-cdk/issues/28369

NetDevAutomate commented 9 months ago

VPC CIDR allocations can't be changed after deployment; however, a secondary CIDR can be added for expansion.

If there is any potential requirement for subnets to be added later, e.g. for a new AZ, then it's recommended to use an explicit subnet strategy with room for growth, ideally using a summarisable range.

nbaillie commented 9 months ago

A few areas of comment below:

Size

Often a /24 is used for a normal app subnet when some EC2 instances or Lambdas land there. I guess this is partly conservative use of IPs and partly due to the easy math it provides for working out the spaces, perhaps a bit of a hangover from traditional network building.

With the growing use of container hosting, accounts for EKS or other platforms often go bigger, as IPs are (or can be) allocated to the containers, so a /21 (2046 IPs) would perhaps be reasonable here.

A /21 would be enough in most cases, but could be seen as either wasteful or not enough. I can imagine that the masses would want to be able to supply a CIDR for the size or something like that, which again moves away from a predictable implementation, or perhaps some T-shirt sizes like small /28, medium /24, large /21 or /20.
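To put numbers on those T-shirt sizes, here is a minimal sketch. The mapping `T_SHIRT_PREFIX` is hypothetical and simply uses the sizes floated above, with "large" fixed at /21. It subtracts the 5 addresses that AWS reserves in every subnet, so the results come out slightly lower than the classic "minus 2" figures quoted elsewhere in this thread (2046 for a /21).

```python
# Hypothetical size names taken from the comment above; "large" uses /21 here.
T_SHIRT_PREFIX = {"small": 28, "medium": 24, "large": 21}

# AWS reserves 5 addresses in every subnet (the first four and the last).
AWS_RESERVED = 5

def usable_addresses(prefix):
    """Addresses left for workloads in an IPv4 subnet of the given prefix length."""
    return 2 ** (32 - prefix) - AWS_RESERVED

for name, prefix in T_SHIRT_PREFIX.items():
    print(name, f"/{prefix}", usable_addresses(prefix))
```

A /28 is also the smallest subnet AWS allows, so "small" here is a hard lower bound rather than an arbitrary choice.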

Expandable vs Fixed

Mostly, when a VPC / Subnet is created there is a rough idea of what it will contain. Fixed in this case could work quite well, and the trade-off on economy of use may be desirable to allow for additions if needed later.

Economy of use

In thinking about the idea of planning ahead for AZs as described above, I wonder if there could be some regional awareness built in, such that the number of AZs with space reserved is allocated accordingly. For example, eu-west-2 only has 3 AZs, so as we can know this we just stick with reserving for 3. Obviously this does raise the question of what happens if the number of AZs ever increases. Overall, could we have the user decide what proportion of the space they want to reserve for expansion, and then use a fixed pattern over the top of that?