AndrewGuenther / cdk-fck-nat

CDK constructs for the fck-nat service
MIT License
65 stars 10 forks source link

DualStack VPC Incompatibility (Small Issue, Potentially Easily Fixable) #344

Open neoakris opened 2 months ago

neoakris commented 2 months ago

I noticed that when I make a vpc with

ipProtocol: ec2.IpProtocol.IPV4_ONLY,
natGatewayProvider: new FckNatInstanceProvider( props );

The fck-NAT EC2 instances that are part of an ASG automatically get public IP addresses (things work normally / as expected.)

However, if I make a vpc with

ipProtocol: ec2.IpProtocol.DUAL_STACK,
natGatewayProvider: new FckNatInstanceProvider( props );

The fck-NAT EC2 instances that are part of an ASG don't get automatically assigned a public IP address. (hard requirement for it to work correctly)

Is there configuration that can be added to fck-NAT's ASG https://github.com/AndrewGuenther/cdk-fck-nat/blob/main/src/index.ts#L248 to force it to auto assign public IP addresses?

AndrewGuenther commented 2 months ago

ACK. I'll make a test case for this and check it out. Just a heads up however that I've been swamped with my day job and progress on issues here has been a bit slow. Given that this appears to be a bug however it will get priority.

neoakris commented 2 months ago

Thank you, especially for this project in general, and coming up with an awesome project name and humorous + useful docs.

By the way no need rush at all, It'll help for a project I'm working on but nothing urgent, I'd be fine to temporarily use an ipv4 vpc temporarily for the next 4 weeks.

I have 3 ideas of workarounds to explore next week, I'll try to report back with results of tinkering by around Sept 23rd. I can also attach a short self contained example here while I'm at it, that might save a few minutes in terms of coming up with a test case.

neoakris commented 2 months ago

This was able to work as an SSCE(short self-contained example) of how to reproduce the issue: cdk-main.ts

import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import { FckNatInstanceProvider, FckNatInstanceProps } from 'cdk-fck-nat'

const reference_to_cdk_bootstrapped_cf_for_storing_state_of_stacks = new cdk.App();

const test_stack = new cdk.Stack(
  reference_to_cdk_bootstrapped_cf_for_storing_state_of_stacks, 
  "vpc-test-stack",
  {
    env: {
        account: process.env.CDK_DEFAULT_ACCOUNT!,
        region: "ca-central-1" //Montreal, Canada (low co2)
    }
  }
);

let fckNat = new FckNatInstanceProvider({
  instanceType: ec2.InstanceType.of(ec2.InstanceClass.T4G, ec2.InstanceSize.MICRO)
});

const test_vpc = new ec2.Vpc(test_stack, "test-vpc", {
  natGateways: 1,
  ipProtocol: ec2.IpProtocol.DUAL_STACK,
  natGatewayProvider: fckNat,
});

1st workable solution:
update test_vpc's subnet to configuration to specify mapPublicIPOnLaunch: true,
Then terminate the nat so the replacement gets a public IP.

Why I don't like this solution:
I'm not sure if this is best, because I could imagine some people might want to launch ec2 instances in the public subnet that would have ipv6 public ip, and not a ipv4 public ip. (I know public ipv6 isn't a real distinction, it's public reachability depends on the subnet it's in, but I think you get my intended meaning.) I theorize that may be why they changed the default behavior for dualstack vpcs, and this config based solution makes that usecase slightly harder/require specifying config to explicitly not attach public IPv4's.

const test_vpc = new ec2.Vpc(test_stack, "test-vpc", {
  natGateways: 1,
  ipProtocol: ec2.IpProtocol.DUAL_STACK,
  natGatewayProvider: fckNaT,
  subnetConfiguration: [
    { 
        name: "Public", 
        subnetType: ec2.SubnetType.PUBLIC,
        mapPublicIpOnLaunch: true, //(this is needed for fck-NAT to work)
    },
    { 
        name: "Private",
        subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS,
    }
  ],
});

2nd workable solution: (I think this is better, but it needs a code change)
After failing at the following
class ModifiedFckNatInstanceProvider extends FckNatInstanceProvider{
due to errors related to incorrectly extending a class with private parameters.

I forked fckNat, (not a true fork but the equivalent of a fork), to allow modification of the configureNat() function, to allow modification of how the asg got constructed.
https://github.com/AndrewGuenther/cdk-fck-nat/blob/main/src/index.ts#L248

ModifiedFckNat.ts, here's the relevant line change I made,
add this single line to lanchTemplate:
associatePublicIpAddress: true,

...
        const autoScalingGroup = new autoscaling.AutoScalingGroup(
          sub, 'FckNatAsg', {
            vpc: options.vpc,
            vpcSubnets: { subnets: [sub] },
            minCapacity: 1,
            maxCapacity: 1,
            groupMetrics: [autoscaling.GroupMetrics.all()],
            launchTemplate: new ec2.LaunchTemplate(sub, 'FckNatLaunchTemplate', {
              associatePublicIpAddress: true, //<-- dualstack vpc's need this
              instanceType: this.props.instanceType,
              machineImage,
              securityGroup: this._securityGroup,
              role: this._role,
              userData: userData,
              ... (this.props.keyName && { keyName: this.props.keyName }),
              keyPair: this.props.keyPair,
            }),
          },
        );
...

Then this became a working solution.

import { ModifiedFckNatInstanceProvider } from '../lib/fork/ModifiedFckNat';
/*...*/

let fckNat = new ModifiedFckNatInstanceProvider({
  instanceType: ec2.InstanceType.of(ec2.InstanceClass.T4G, ec2.InstanceSize.MICRO)
});
const test_vpc = new ec2.Vpc(test_stack, "test-vpc", {
  natGateways: 1,
  ipProtocol: ec2.IpProtocol.DUAL_STACK,
  natGatewayProvider: fckNat,
});

3rd (potentially) workable solution: pass pre-provisioned / reserved EIPs to FckNat

I couldn't quickly get this to work. Rather than take the time to figure it out/debug it, I stopped bothering based on the 2nd solution seeming to work well enough.

let eip1 = new ec2.CfnEIP(test_stack, "fckNat-eip1");
let eip2 = new ec2.CfnEIP(test_stack, "fckNat-eip2");
let eip3 = new ec2.CfnEIP(test_stack, "fckNat-eip3");

let fckNat = new FckNatInstanceProvider({
  instanceType: ec2.InstanceType.of(ec2.InstanceClass.T4G, ec2.InstanceSize.MICRO),
  eipPool: [eip1.attrAllocationId, eip2.attrAllocationId, eip3.attrAllocationId],
});

const test_vpc = new ec2.Vpc(test_stack, "test-vpc", {
  natGateways: 1,
  ipProtocol: ec2.IpProtocol.DUAL_STACK,
  natGatewayProvider: fckNat,
});
test_vpc.node.addDependency(eip1);
test_vpc.node.addDependency(eip2);
test_vpc.node.addDependency(eip3);

I think the 2nd is clean enough to warrant implementation.

I'd be happy to make 1 line change PR if you're interested?