RaJiska / terraform-aws-fck-nat

Terraform module for fck-nat
https://registry.terraform.io/modules/RaJiska/fck-nat/aws/latest
MIT License
71 stars 37 forks

Attaching the same elastic IP not working #5

Closed hrist0stoichev closed 11 months ago

hrist0stoichev commented 11 months ago

Hello and thank you for developing this module! I'm trying to pass eip_allocation_ids so that the NAT instance always has the same public IP but it's not working. I don't see the elastic IPs being associated with the EC2 instances. Here's an example configuration:

resource "aws_eip" "nat" {
  count = length(var.external_subnets)
  vpc   = true
}

module "fck-nat" {
  count = length(var.external_subnets)

  source  = "RaJiska/fck-nat/aws"
  version = "1.2.0"

  name                 = "fck-nat-${count.index}"
  vpc_id               = aws_vpc.main.id
  subnet_id            = element(aws_subnet.external.*.id, count.index)

  ami_id               = "ami-00d653f185e930c04"
  instance_type        = "t4g.nano"
  use_spot_instances   = false
  use_cloudwatch_agent = true
  encryption           = true
  update_route_table   = true
  route_table_id       = element(aws_route_table.internal.*.id, count.index)
  ha_mode              = true
  eip_allocation_ids   = [element(aws_eip.nat.*.id, count.index)]
}

I see that the allocation_id is passed in the user data as eip_id=... but for some reason it is not associated with the EC2 as it should be happening here. I tried viewing logs from inside the instance with aws ec2 get-console-output --instance-id ... but didn't have much success as no logs were shown. I'm wondering whether I need to change my configuration to something like

resource "aws_eip" "nat" {
  count = length(var.external_subnets)
  vpc               = true
  network_interface = element(module.fck-nat.*.eni_id, count.index)
}

which would associate the Elastic IP with the Network Interface but then I'm not sure why this call should be made at all. As a matter of fact, I'm also wondering about those calls as well since the network interface is already attached to the EC2. Maybe there's no need to pass eni_id and eip_id if the infrastructure is already configured (eip and eni attached)? That would also remove the need for the IAM permissions.

Another eip related thing I'm wondering about is this one. Do you think that if eip_allocation_ids is provided there's no need to create an ephemeral public network interface since there's already an elastic IP that will be associated with this instance?
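
To confirm from outside the instance whether the address was ever associated, a check along these lines may help. It is a minimal sketch, assuming the AWS CLI is installed and configured; `eip_status` is a hypothetical helper name and the allocation ID in the example is a placeholder for one of the IDs passed to eip_allocation_ids.

```shell
#!/usr/bin/env bash
# Sketch: report whether an Elastic IP is currently associated with an instance.
# Assumes a configured AWS CLI; eip_status is a hypothetical helper name.

# Prints the instance ID the address is associated with, or "unassociated".
eip_status() {
  local allocation_id="$1"
  local instance_id
  instance_id="$(aws ec2 describe-addresses \
    --allocation-ids "$allocation_id" \
    --query 'Addresses[0].InstanceId' \
    --output text)" || return 1
  # With --output text, a missing field is printed as the literal string "None".
  if [ -z "$instance_id" ] || [ "$instance_id" = "None" ]; then
    echo "unassociated"
  else
    echo "$instance_id"
  fi
}

# Example (placeholder ID):
#   eip_status eipalloc-0123456789abcdef0
```

If this prints "unassociated" while the instance is running, the association step inside the instance most likely never ran or failed.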

RaJiska commented 11 months ago

Hi @hrist0stoichev ,

Which AMI are you using for this? Please be aware that, as stated in the README, some of the features provided by this Terraform module are not fully released yet in the fck-nat project. This means that to use them, you need to build the fck-nat image yourself from the latest commit. The same goes for the CloudWatch agent (please make sure you are aware of the costs: the CloudWatch agent is likely to cost you several times the cost of the instance itself).

> I'm not sure why this call should be made at all. As a matter of fact, I'm also wondering about those calls as well since the network interface is already attached to the EC2.

> Another eip related thing I'm wondering about is this one. Do you think that if eip_allocation_ids is provided there's no need to create an ephemeral public network interface since there's already an elastic IP that will be associated with this instance?

The ephemeral public network interface is used to communicate with the internet and then back to the private network, while the second ENI is a floating one meant to be re-associated from instance to instance allowing the HA-mode to work seamlessly without having to alter the route table when the NAT instance is replaced. When an instance is replaced, it will self-assign the floating ENI, and ensure that the ephemeral public network interface is configured to work with NAT by disabling IP source checks.

As for why an ephemeral public IP is assigned to the ephemeral network interface: the instance assigns the EIP to itself, and it needs public networking to reach the AWS API in order to do so.
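
The self-assignment described above can be pictured roughly as the following AWS CLI calls made from inside the instance. This is a sketch under stated assumptions, not the actual fck-nat service script: the function names are hypothetical, every ID is supplied by the caller, and attaching the floating ENI at device index 1 is an assumption.

```shell
#!/usr/bin/env bash
# Sketch of the self-assignment steps, NOT the real fck-nat script.
# Function names are hypothetical; all IDs are passed in by the caller.

# Attach the floating ENI to this instance as a secondary interface
# (device index 1 is an assumption for illustration).
attach_floating_eni() {
  local instance_id="$1" eni_id="$2"
  aws ec2 attach-network-interface \
    --instance-id "$instance_id" \
    --network-interface-id "$eni_id" \
    --device-index 1
}

# Disable source/destination checks on an interface so it can forward NAT traffic.
disable_source_dest_check() {
  local eni_id="$1"
  aws ec2 modify-network-interface-attribute \
    --network-interface-id "$eni_id" \
    --no-source-dest-check
}

# Associate the Elastic IP with the instance's public interface.
associate_eip() {
  local allocation_id="$1" public_eni_id="$2"
  aws ec2 associate-address \
    --allocation-id "$allocation_id" \
    --network-interface-id "$public_eni_id" \
    --allow-reassociation
}
```

Each of these is a call to the public EC2 API, which is why the instance needs both outbound internet access and the matching IAM permissions (ec2:AttachNetworkInterface, ec2:ModifyNetworkInterfaceAttribute, ec2:AssociateAddress).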

In the future, an option could be added to disable the ephemeral public IP, provided the subnet the instance is launched in is PrivateLink-connected to the relevant AWS API. This hasn't been a focus so far, as such setups tend to be rare, expensive, and perhaps overkill.

Hope that clarifies things.

hrist0stoichev commented 11 months ago

> Which AMI are you using for this? Please be aware that, as stated in the README, some of the features provided by this Terraform module are not fully released yet in the fck-nat project. This means that to use them, you need to build the fck-nat image yourself from the latest commit.

I was using the latest publicly available AMI, but after your comment I decided to build one myself, and now the EIP assignment seems to be working just fine. It's pretty annoying that the latest publicly available AMI is based on a version that lacks functionality clearly described in the documentation, without any mention of that. I am actually glad that I decided to open the issue; if you hadn't told me about this, I might still be looking for the problem elsewhere. So, thank you!

Here's some help for people trying to build their own AMI on macOS since, again, it's not clearly described in the documentation:

# Only set those 2 variables, everything else can be copy pasted
V=1.2.3-whatever        # the version to build
R=eu-west-2             # where to push the AMI to

git clone https://github.com/AndrewGuenther/fck-nat.git
cd fck-nat

# Values must be quoted to be valid HCL strings; $V and $R still expand
cat << EOF > packer/my-vars.pkrvars.hcl
architecture = "arm64"
version = "$V"
region = "$R"
EOF

brew install packer
packer init ./packer/fck-nat.pkr.hcl

# Build the package in a container, then bake the AMI with Packer
docker run -w /src -v "$(pwd)":/src kong/fpm make VERSION="$V" package
packer build -var-file="./packer/my-vars.pkrvars.hcl" ./packer/fck-nat.pkr.hcl

RaJiska commented 11 months ago

Indeed, I had expected those features to be published as part of a new version earlier. Considering a few of my PRs haven't been reviewed for a while, I believe the owner might have other priorities at the moment. I do agree with you though: a marker in the documentation indicating the version in which each feature became available would be good to have.

I updated the documentation of this project to make it more visible as to which features may not be available in the latest published fck-nat version.

hrist0stoichev commented 11 months ago

Thank you! 🙇

SKeeneCode commented 11 months ago

Would anyone be able to suggest what may be wrong? I am having the same problem.

I built my own AMI image and am trying to use it:

module "fck-nat" {
  source = "./TerraformModules/terraform-aws-fck-nat-main"

  name      = "fck-nat"
  vpc_id    = aws_vpc.my_vpc.id
  subnet_id = aws_subnet.public_subnet_a.id
  ha_mode   = true
  ami_id    = "ami-0a648f49318daa3a9" // built from https://github.com/AndrewGuenther/fck-nat.git
  eip_allocation_ids = [aws_eip.fck-nat.id]

  update_route_tables = true
  route_tables_ids = {
    "private" = aws_route_table.private_route_table.id
  }
}

Interestingly, with my own built image there does not seem to be the static internal-facing interface, only the public one.

But the elastic IP still does not associate with the public instance/interface.

Am I missing something obvious?

RaJiska commented 11 months ago

Hi @SKeeneCode ,

I'd need some logs to understand what's happening. You can retrieve service logs by SSHing into the instance and running journalctl -u fck-nat.

Since I initially expected my SSM PR to be approved quickly, I didn't add an SSH key parameter to the Terraform module. You may want to manually create a new version of the launch template with your SSH key, manually update the ASG, and then log into the instance.
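
The manual workaround could look roughly like the sketch below. It assumes an existing EC2 key pair and a configured AWS CLI; the function name, IDs, and names are placeholders, and the source version number has to be looked up first (for example with aws ec2 describe-launch-template-versions). SSH access also depends on the security group allowing it, which is not handled here.

```shell
#!/usr/bin/env bash
# Sketch: add an SSH key to the launch template, point the ASG at the new
# version, and replace the running instance. All names are placeholders.

add_ssh_key_and_refresh() {
  local launch_template_id="$1" source_version="$2" asg_name="$3" key_name="$4"

  # New template version: inherits from source_version, with the key pair added.
  aws ec2 create-launch-template-version \
    --launch-template-id "$launch_template_id" \
    --source-version "$source_version" \
    --launch-template-data "{\"KeyName\":\"$key_name\"}"

  # Make the ASG use the newest template version.
  aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "$asg_name" \
    --launch-template "LaunchTemplateId=$launch_template_id,Version=\$Latest"

  # Replace the current instance so it launches with the key installed.
  aws autoscaling start-instance-refresh \
    --auto-scaling-group-name "$asg_name"
}

# Example (placeholder values):
#   add_ssh_key_and_refresh lt-0123456789abcdef0 1 my-fck-nat-asg my-key
# Then, on the replacement instance:
#   journalctl -u fck-nat --no-pager
```

Because these changes are made outside Terraform, a later terraform apply may revert them, so this is best treated as a temporary debugging step.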

RaJiska commented 11 months ago

Hey @hrist0stoichev ,

Following up on your earlier concern about the documentation covering unreleased features: your suggestion was brought up, and the owner of the project has updated the documentation to be versioned. This should make things less confusing for people working with this project in the future.

hrist0stoichev commented 11 months ago

That's great news! I hope fewer people have trouble running fck-nat this way!