Two ENIs? - GitHub Issues

madeupname commented 1 year ago

Thanks for this project, I'm excited to get it to work. My goal is Internet access for a Lambda on a private subnet. I will share all my config and troubleshooting below, but the big thing I notice in contrast to the docs is that my fck-nat EC2 instance has two network interfaces. Is that normal?

The first is FckNatInterface created by the CFN stack. The second appears to have been created automatically when the instance was launched by FckNatAsgLaunchConfig. They have all the same settings (other than private IPv4 address), including security group (NatSecurityGroup) and subnet (public), but FckNatInterface is missing a public IP address. The other one has it.

I note that FckNatInterface is attached by the service script. Given that "Enable auto-assign public IPv4 address" is turned on for the public subnet, my guess is that AWS is creating an ENI with a public IP and attaching it before startup.

If I'm reading /opt/fck-nat/fck-nat.sh correctly, a possible solution is:

1. Remove FckNatInterface from the CFN template.
2. Remove the line echo "eni_id=${FckNatInterface}" >> /etc/fck-nat.conf from the FckNatAsgLaunchConfig UserData.
3. Terminate the instance.

At that point, the ASG should launch a new instance which, if it follows previous behavior, will come up with an ENI that has a public IP, and the service script will use that ENI. But the bigger question is: why is this happening if it's not what you intended? And, of course, will it solve the problem? I confess it's been a while since I've worked professionally as a UNIX/Linux admin.

I should add that I never had a NAT gateway, as this is a project in development, but I assumed I could start with fck-nat directly. If there is some magic when adding a NAT gateway first, I can do that.

Thanks!

Troubleshooting notes:

- I'm using the CFN template with minor tweaks, like adding a key pair and allowing ssh. It's part of my AWS SAM template.yaml file.
- I manually verified all created resources using your manual/web console instructions as a guide. Everything is correct except for the second ENI.
- I can ssh to the instance and, per your test, running curl ifconfig.me there returns the public IP of the instance.
- I've run the Reachability Analyzer using the ENI of the Lambda (private subnet) as the source, and it can reach both FckNatInterface and the instance itself. Nothing blocks it.
- The CIDR parameter is that of the VPC, not the subnet.
- The subnet parameter is the public subnet.
- I've tried each of the instance's ENIs as the destination in the private route table, but neither works.

madeupname commented 1 year ago

I decided to test my theory above. I made the changes and refreshed the instance in the ASG. I got an instance with a single ENI with a public IP, in the correct security group and subnet, but I did need to manually disable the source/destination check. I then updated the private subnet route table to point to the new instance.
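For anyone repeating the two manual steps just described (disabling the source/destination check and repointing the private route table), here is a minimal sketch in Python. It assumes boto3 is installed and that an EC2 client is passed in. The function name configure_nat_instance and the IDs in the usage comment are hypothetical, not something from fck-nat.

```python
def configure_nat_instance(ec2, instance_id, route_table_id,
                           destination_cidr="0.0.0.0/0"):
    """Disable the source/destination check and route traffic via the instance.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # A NAT instance forwards packets that are not addressed to itself,
    # so the EC2 source/destination check has to be switched off.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        SourceDestCheck={"Value": False},
    )
    # replace_route only works if a route for this CIDR already exists;
    # for a brand-new route table, create_route would be needed instead.
    ec2.replace_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock=destination_cidr,
        InstanceId=instance_id,
    )

# Hypothetical usage (placeholder IDs):
#   import boto3
#   configure_nat_instance(boto3.client("ec2"),
#                          "i-0123456789abcdef0", "rtb-0123456789abcdef0")
```

Note that a route targeting an instance ID has to be updated again whenever the ASG replaces the instance.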

Here's the thing - it didn't work. I spent an inordinate amount of time troubleshooting and rechecking everything.

The fix? I made a small change to my Lambda function to perform DNS resolution in order to test that. It appears the act of redeploying the function... jiggered something? And it just started working. I even verified there were no other resource changes, it was all the same. Given similar comments in my search, the whole thing (AWS/Lambda) seems a bit flaky.
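A connectivity check of the kind mentioned above could look roughly like the following Lambda handler. This is only a sketch, not the function from this thread: the names check_egress and _fetch_status are made up for illustration, and ifconfig.me is used as the target simply because it is the host from the curl test. It reports DNS and HTTP results separately, which helps tell a resolution problem apart from a routing problem.

```python
import socket
import urllib.request


def _fetch_status(url, timeout):
    """Return the HTTP status code for a GET request to url."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def check_egress(host="ifconfig.me", resolve=socket.gethostbyname,
                 fetch=_fetch_status, timeout=5):
    """Try DNS resolution first, then an outbound HTTPS request."""
    result = {"host": host, "dns": None, "http": None}
    try:
        result["dns"] = resolve(host)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result["dns_error"] = str(exc)
        return result
    try:
        result["http"] = fetch(f"https://{host}", timeout)
    except OSError as exc:  # URLError and socket timeouts are OSErrors too
        result["http_error"] = str(exc)
    return result


def lambda_handler(event, context):
    # The "host" key in the event is an assumption made for this sketch.
    return check_egress((event or {}).get("host", "ifconfig.me"))
```

Keep the timeout below the function's configured Lambda timeout, otherwise a missing route shows up as a Lambda timeout rather than as an http_error entry.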

That said, I still don't know why/how I was getting 2 ENIs. Curious to know.

AndrewGuenther commented 1 year ago

There are two ENIs because one carries the internal IP address, which is where the traffic from your VPC goes, and the other carries the external IP address, which is where your requests go out to the internet from.

The behavior you saw is likely because you changed your route table to point to your new NAT instance, but resources still had the old route cached. This is why having that static internal IP is so important: by using a consistent ENI for the internal IP address, fck-nat ensures that once your default route is configured, it doesn't need to be updated, even if you bring up a new NAT instance.
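To illustrate the static-ENI idea, here is a rough Python sketch, assuming a boto3 EC2 client is passed in. It is not fck-nat's actual service script. The function names are hypothetical, and the device index of 1 is an assumption (index 0 being the interface the instance launched with).

```python
def route_default_via_eni(ec2, route_table_id, eni_id):
    """One-time setup: point the private default route at the long-lived ENI."""
    # Assumes a 0.0.0.0/0 route already exists; otherwise use create_route.
    ec2.replace_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock="0.0.0.0/0",
        NetworkInterfaceId=eni_id,
    )


def attach_static_eni(ec2, eni_id, instance_id, device_index=1):
    """On each new NAT instance: attach the same ENI as a secondary interface."""
    response = ec2.attach_network_interface(
        NetworkInterfaceId=eni_id,
        InstanceId=instance_id,
        DeviceIndex=device_index,  # assumption: 0 is the primary ENI
    )
    return response["AttachmentId"]
```

Because the route targets the ENI rather than an instance ID, a replacement instance only needs to attach the ENI and the route table stays untouched.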

AndrewGuenther / fck-nat

Two ENIs? #36