ansible / ansible-modules-extras

Ansible extra modules - these modules ship with ansible
948 stars 1.46k forks source link

Add support for AWS Managed NAT Gateways #1401

Closed kabo closed 7 years ago

kabo commented 8 years ago
ISSUE TYPE

Feature Idea

COMPONENT NAME

new module

ANSIBLE VERSION

N/A

SUMMARY

AWS has released support for Managed NAT Gateways. It would be nice if one could set them up with Ansible.

Etherdaemon commented 8 years ago

I'm happy to take a stab at this if no-one else has already?

Etherdaemon commented 8 years ago

Some notes on some things I've noticed after taking an initial look.

Although the creation of the NATGateway requires a ClientToken to ensure idempotency, that token is not shown in any further describe calls on nat_gateways. Updates to the nat_gateway is also not supported which leaves either delete and recreate with new values, or return error back to user that the request is invalid as an existing NAT gateway with that ClientToken has already been specified.

I'm going with the latter at this stage, as with the NAT Gateways you have to provide an elastic IP, which on termination of the gateway by default does not release the EIP. Also deletion of nat gateways don't update the route tables by removing the routes, which means that if I was then to drop and recreate as the method of updating then the user would be left with a bunch of black holes in their route tables after every play run.

I'm also going to provide the option to release the EIP on termination of the NAT gateway as having a non-released EIP in AWS can get quite costly.

Haven't decided on the route table configuration yet and whether to remove from all route tables yet or not - at this stage I'm going to leave that as out of scope and the user has to use the route_tables module instead.

jonhadfield commented 8 years ago

I submitted PR https://github.com/ansible/ansible-modules-extras/pull/1438 earlier today and then came across this thread and your submission. With regards to idempotency, I handle it a different way: Addition 1) Get all NAT Gateways in the specified subnet 2) Loop through NAT Gateways and check for matching specified allocation-id 3) If there's a matching NAT Gateway and its state is 'available' then exit with changed=False 4) During creation, check every 5 seconds for a failure (ignore previous failures by checking time) and succeed only once available. (timeout after 300 seconds but it generally takes less than 3 minutes so may drop it down a bit) Deletion If it exists, call delete and wait until status equals 'deleted' or it doesn't exist.

In keeping with one tool/module for a job, I personally wouldn't want to update routing tables and release EIPs as part of this module. I feel it would add a lot of complexity and duplicate the functionality of existing modules.

I'm busy testing my version against some real world examples, but will take a look at yours and feedback asap. Feel free to take anything from mine, if it helps, and I'll give attribution if I take any ideas/code from you.

Btw, I'm calling it like this:

- name: create VPC NAT Gateway for AZ A
  ec2_vpc_nat_gateway:
    state: present
    subnet_id: "{{ lookup('aws_subnet_ids_from_names', (aws_region, [nat_nodes_zone_a_subnet])) }}"
    allocation_id: "{{ lookup('aws_ec2_allocation_id_from_eip', ('eu-west-1', nat_nodes_a_eip) }}"

Where I'm writing lookup plugins and storing subnet names and eips in vars so that I can run in isolation of the tasks that get the EIP and create the subnets. Will upload plugins to https://github.com/jonhadfield/ansible-lookups when tested.

gergnz commented 8 years ago

@jonhadfield this line is wrong: https://github.com/Etherdaemon/ansible-modules-extras/blob/new_aws_nat_gateway_module/cloud/amazon/ec2_vpc_managed_ngw.py#L247

You've already grabbed the first item from Addresses dict a few lines up. It should read:

return allocation['AllocationId']
jonhadfield commented 8 years ago

@gergnz I think you meant to direct that at @Etherdaemon

Etherdaemon commented 8 years ago

@gergnz @jonhadfield Cheers - I've updated - thanks for picking that up

gergnz commented 8 years ago

Sorry @jonhadfield, @Etherdaemon. Thanks.

caseylucas commented 8 years ago

FYI, I added https://github.com/ansible/ansible-modules-extras/issues/1537 which relates to the new module being discussed here. For my use case, it's kind of pointless to be able to add a nat gateway if I cant then add it to routes.

Etherdaemon commented 8 years ago

Hi @caseylucas actually i believe routes do work. The message you are getting is because you cannot have 2 routes to the same destination from the same route table which is an AWS limitation. You'll need to remove your other route which I'm assuming is probably going to an existing nat/bastion host? ? Or you're accidentally trying to put the route in your external route table which already has a route via the Internet gateway and doesn't require a nat gateway to translate.

caseylucas commented 8 years ago

Hi @Etherdaemon. The route tables are brand new. (I start with no route tables in the fresh VPC.) I use new nat gateway module (Thanks to you and @jonhadfield ) to create a nat gateway, then use ec2_vpc_route_table passing the nat_gateway_id as gateway_id parameter. (There is no "nat_gateway_id" parameter.) The 1st time playbook runs and creates the route table correctly (with default route of nat gateway). If I run the playbook a second time, it fails. I believe the ec2_vpc_route_table module is unable to recognize that the nat gateway is already setup as a default route and attempts to add it again (which fails). Maybe I'm missing something else though.

gergnz commented 8 years ago

@caseylucas you are correct. The problem stems from the lookup. (https://github.com/ansible/ansible-modules-extras/blob/devel/cloud/amazon/ec2_vpc_route_table.py#L302). This is using boto, and boto has no understanding of nat gateways. When boto gets the list of routers, there is no route present with a destination of nat-gateway, thus the lookup fails (as it can't compare), and thus tries to re-add it.

I'm not sure AWS will be adding nat gateway support to boto, instead, we'll need to upgrade to boto3 for the ec2_vpc_route_table module.

To work around this, I did a horrible hack using command and the AWS CLI.

Etherdaemon commented 8 years ago

@caseylucas let me take a look at the route table module. I have an existing PR in when I was looking at the vpc endpoints routes - so I might be able to find a fix for it.

caseylucas commented 8 years ago

@Etherdaemon, Great, thanks. Does your fix include boto3 or maybe you were thinking boto3? I can help test if needed.

Etherdaemon commented 8 years ago

@caseylucas the original fix for the vpc endpoints does not include boto3 but I think this fix will as boto (v2) doesn't appear to have any knowledge of nat gateways but I'll verify that

wholroyd commented 8 years ago

Just submitted a PR to add NAT gateways in https://github.com/boto/boto/pull/3472.

caseylucas commented 8 years ago

I think a similar lookup problem exists (in ec2_vpc_route_table) when the route contains a vpc endpoint. When a route table includes a vpc endpoint it fails with:

fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "invocation": {"module_args": {"_ansible_check_mode": false, "_ansible_debug": false, "_ansible_diff": false, "_ansible_no_log": false, "_ansible_verbosity": 3, "aws_access_key": null, "aws_secret_key": null, "ec2_url": null, "lookup": "tag", "profile": null, "propagating_vgw_ids": [], "region": "us-east-1", "route_table_id": null, "routes": [{"destination_cidr_block": "0.0.0.0/0", "gateway_id": "igw-692fee0d"}], "security_token": null, "state": "present", "subnets": ["subnet-43d42669", "subnet-631eb215", "subnet-46d4266c", "subnet-6c1eb21a"], "tags": {"Creator": "ansible", "Environment": "casey", "Name": "rt_us-east-1_casey_public_all"}, "validate_certs": true, "vpc_id": "vpc-a8ceb1cc"}, "module_name": "ec2_vpc_route_table"}, "msg": "Value (None) for parameter destinationCidrBlock is invalid. This is not a valid CIDR block."}
Etherdaemon commented 8 years ago

Hi @caseylucas - as mentioned on the other PR for endpoints I raised a fix PR #1511 for route tables to ignore the endpoints as endpoints populate their own routes.

In terms of this nat gateway one - I've taken a quick look using my original proposed nat gateway as it provides the interface ID when you utilise a wait which I then used to create a new route.

I haven't had a chance to take a look at creating a route using a nat-gateway id yet but I'll do that next.

In the meantime - this mostly works using the interface ID - in the sense that I do get a This route exists if I was to run the play again - I can get around that by putting a when statement on to hook off the create nat gateway task so that routes are only created if the nat gateway result has changed. The below play does assume that the latest commit #1511 has also been applied as well.

- name: Create nat gateway
  ec2_vpc_managed_ngw:
    state: present
    subnet_id: subnet-1bb9f67e
    region: ap-southeast-2
    profile: personal
    eip_address: 52.62.132.50
    wait: yes
  register: new_nat_gateway

- name: Debugging
  debug:
    msg: "{{ new_nat_gateway.result['NatGatewayAddresses'] }}"

- name: Add route
  ec2_vcp_route_table:
    state: present
    vpc_id: vpc-ce26b7ab
    region: ap-southeast-2
    profile: personal
    propagating_vgw_ids: None
    tags:
      Name: Public
    subnets:
      - subnet-1bb9f67e
    routes:
      - dest: 0.0.0.0/0
        interface_id: "{{ new_nat_gateway.result['NatGatewayAddresses'][0]['NetworkInterfaceId'] }}"
  # when: new_nat_gateway.changed

I'll take a look at route tables using the nat gateway id but that could be a tad trickier.

caseylucas commented 8 years ago

@Etherdaemon, interesting work around. I tried it and it works. Thanks. If you have multiple routes to include (besides the nat gateway), this work around might be a problem though because of the idempotent nature of the ec2_vpc_route_table module. Unless I'm misunderstanding something, it tries to make the routes in aws match exactly the routes in the module parameter - including removing ones that are not there.

Etherdaemon commented 8 years ago

@caseylucas yes - definetely can't leave it as is. And yes I believe it does have something to do with the checking of routes.

I also tried it using the gateway_id and my original module and that appears to work as well which probably means that @jonhadfield module should work as well using the fixes in #1511

I'll take a look at the route tables module now as the nat gateway on the second run without the when statement I do get the Route already exists error which is a pain. I agree that having the when statement is only a very temporary work around

Etherdaemon commented 8 years ago

Hi @caseylucas #1511 should have fixes in now and both nat_gateway_id and interface ID should technically work I think for the route table module. I've tested adding and removing routes on my own account and it seems to be working as far as I can tell.

aioue commented 8 years ago

+1

pierreant-p commented 8 years ago

+1

brewek commented 8 years ago

I strongly approve this idea. ;)

howinator commented 8 years ago

+1

nickallevato commented 8 years ago

+1 ty

missnebun commented 8 years ago

+1

ansibot commented 7 years ago

This repository has been locked. All new issues and pull requests should be filed in https://github.com/ansible/ansible

Please read through the repomerge page in the dev guide. The guide contains links to tools which automatically move your issue or pull request to the ansible/ansible repo.

ansibot commented 7 years ago

This issue was migrated to https://github.com/ansible/ansible/issues/29251