pulumi / pulumi

Pulumi - Infrastructure as Code in any programming language 🚀
https://www.pulumi.com
Apache License 2.0

Failed to successfully update stack when changing CIDR block on VPC #7139

Open aureq opened 3 years ago

aureq commented 3 years ago

When updating a VPC IP range that contains an EC2 instance, Pulumi fails to correctly set the stack in the new desired state.

Expected behavior

The VPC and all its resources should be migrated into the new VPC.

Current behavior

The update fails and leaves both stacks (original and new) in an unusable state. Two EC2 instances are also left running.

Steps to reproduce

To reproduce the issue, follow the steps described below in the 4 phases. Each phase produces its own stack-state-X.json, and I've included the ones I generated in the attached stackstates.zip file.

  1. Phase 1: Actions
    • pulumi up: initial deployment
    • pulumi stack export > stack-state-0.json
    • change cidr block from 10.137.0.0/16 to 10.42.0.0/16 in Pulumi.dev.yaml
    • pulumi up
Previewing update (dev)

View Live: https://app.pulumi.com/aureq/aws-py-cidr-change/dev/previews/7871afa2-9229-4e1d-8e61-834647efa7f6

     Type                                 Name                                   Plan        Info
     pulumi:pulumi:Stack                  aws-py-cidr-change-dev                             
 +-  └─ aws:ec2:Vpc                       cidr-change-vpc                        replace     [diff: ~cidrBlock]
 ~      ├─ aws:ec2:InternetGateway        cidr-change-igw                        update      [diff: ~vpcId]
 +-     ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2a     replace     [diff: ~cidrBlock,vpcId]
 +-     │  └─ aws:ec2:Instance            cidr-change-dummy                      replace     [diff: ~subnetId,vpcSecurityGroupIds]
 +-     ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2b     replace     [diff: ~cidrBlock,vpcId]
 +-     ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2c     replace     [diff: ~cidrBlock,vpcId]
 +-     ├─ aws:ec2:SecurityGroup          cidr-change-public-sg-wg               replace     [diff: ~vpcId]
 +-     ├─ aws:ec2:SecurityGroup          cidr-change-public-sg-ssh              replace     [diff: ~vpcId]
 +-     ├─ aws:ec2:RouteTable             cidr-change-rt                         replace     [diff: ~vpcId]
 +-     ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2a  replace     [diff: ~routeTableId,subnetId]
 +-     ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2b  replace     [diff: ~routeTableId,subnetId]
 +-     └─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2c  replace     [diff: ~routeTableId,subnetId]

Outputs:
  ~ instance_id   : "i-03419037751433945" => output<string>
  ~ public_dns    : "ec2-13-211-201-1.ap-southeast-2.compute.amazonaws.com" => output<string>
  ~ public_ip     : "13.211.201.1" => output<string>
  ~ vpc_dns_server: "10.137.0.2" => "10.42.0.2"

Resources:
    ~ 1 to update
    +-11 to replace
    12 changes. 1 unchanged

Do you want to perform this update? yes
Updating (dev)

View Live: https://app.pulumi.com/aureq/aws-py-cidr-change/dev/updates/5

     Type                           Name                                Status                  Info
     pulumi:pulumi:Stack            aws-py-cidr-change-dev              **failed**              1 error
 +-  └─ aws:ec2:Vpc                 cidr-change-vpc                     replaced                [diff: ~cidrBlock]
 ~      ├─ aws:ec2:InternetGateway  cidr-change-igw                     **updating failed**     [diff: ~vpcId]; 1 error
 +-     ├─ aws:ec2:Subnet           cidr-change-subnet-ap-southeast-2a  replaced                [diff: ~cidrBlock,vpcId]
 +-     │  └─ aws:ec2:Instance      cidr-change-dummy                   replaced                [diff: ~subnetId,vpcSecurityGroupIds]
 +-     ├─ aws:ec2:Subnet           cidr-change-subnet-ap-southeast-2b  replaced                [diff: ~cidrBlock,vpcId]
 +-     ├─ aws:ec2:Subnet           cidr-change-subnet-ap-southeast-2c  replaced                [diff: ~cidrBlock,vpcId]
 +-     ├─ aws:ec2:SecurityGroup    cidr-change-public-sg-wg            replaced                [diff: ~vpcId]
 +-     └─ aws:ec2:SecurityGroup    cidr-change-public-sg-ssh           replaced                [diff: ~vpcId]

Diagnostics:
  aws:ec2:InternetGateway (cidr-change-igw):
    error: 1 error occurred:
        * updating urn:pulumi:dev::aws-py-cidr-change::aws:ec2/vpc:Vpc$aws:ec2/internetGateway:InternetGateway::cidr-change-igw: 1 error occurred:
        * Error waiting for internet gateway (igw-0619aca7c131bb260) to detach: timeout while waiting for state to become 'detached' (last state: 'detaching', timeout: 15m0s)

  pulumi:pulumi:Stack (aws-py-cidr-change-dev):
    error: update failed

Outputs:
  ~ instance_id   : "i-03419037751433945" => "i-0a9af2a027aa2b182"
  ~ public_dns    : "ec2-13-211-201-1.ap-southeast-2.compute.amazonaws.com" => "ec2-3-26-60-120.ap-southeast-2.compute.amazonaws.com"
  ~ public_ip     : "13.211.201.1" => "3.26.60.120"
  ~ vpc_dns_server: "10.137.0.2" => "10.42.0.2"

Resources:
    +-7 replaced
    1 unchanged

Duration: 15m18s
  1. Phase 2: Actions
    • pulumi stack export > stack-state-1.json
    • With the AWS console
      • the original VPC contains the original EC2 instance in running state with its one ENI -> Terminate EC2 instance
    • pulumi up --refresh
Previewing update (dev)

View Live: https://app.pulumi.com/aureq/aws-py-cidr-change/dev/previews/aaa38cbe-d32f-4d9e-b417-bfd89abfcda4

     Type                                 Name                                   Plan        Info
     pulumi:pulumi:Stack                  aws-py-cidr-change-dev                             
     └─ aws:ec2:Vpc                       cidr-change-vpc                                    [diff: ~tags]
        ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2c                 [diff: +__defaults~tags]
        ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2a                 [diff: +__defaults~tags]
        │  └─ aws:ec2:Instance            cidr-change-dummy                                  [diff: ~tags,vpcSecurityGroupIds]
        ├─ aws:ec2:SecurityGroup          cidr-change-public-sg-wg                           [diff: ~tags]
 ~      ├─ aws:ec2:InternetGateway        cidr-change-igw                        update      [diff: ~vpcId]
        ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2b                 [diff: +__defaults~tags]
        ├─ aws:ec2:SecurityGroup          cidr-change-public-sg-ssh                          [diff: ~tags]
 +-     ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2a  replace     [diff: ~routeTableId,subnetId]
 +-     ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2b  replace     [diff: ~routeTableId,subnetId]
 +-     ├─ aws:ec2:RouteTable             cidr-change-rt                         replace     [diff: ~vpcId]
 +-     └─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2c  replace     [diff: ~routeTableId,subnetId]

Resources:
    ~ 1 to update
    - 6 to delete
    +-4 to replace
    11 changes. 8 unchanged

Do you want to perform this update? yes
Updating (dev)

View Live: https://app.pulumi.com/aureq/aws-py-cidr-change/dev/updates/6

     Type                                 Name                                   Status                    Info
 ~   pulumi:pulumi:Stack                  aws-py-cidr-change-dev                 **refreshing failed**     1 error
 -   └─ aws:ec2:Vpc                       cidr-change-vpc                        **deleting failed**       1 error
 -      ├─ aws:ec2:SecurityGroup          cidr-change-public-sg-wg               deleted                   
 -      ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2b     deleted                   
 ~      ├─ aws:ec2:InternetGateway        cidr-change-igw                        refresh                   [diff: -__defaults~tags]
 -      ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2a     deleted                   
 ~      │  └─ aws:ec2:Instance            cidr-change-dummy                      refresh                   [diff: ~tags,vpcSecurityGroupIds]
 -      ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2c     deleted                   
 -      ├─ aws:ec2:SecurityGroup          cidr-change-public-sg-ssh              deleted                   
 ~      ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2a  refresh                   [diff: -__defaults]
 ~      ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2b  refresh                   [diff: -__defaults]
 ~      ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2c  refresh                   [diff: -__defaults]
 ~      └─ aws:ec2:RouteTable             cidr-change-rt                         refresh                   [diff: -__defaults~routes,tags]

Diagnostics:
  aws:ec2:Vpc (cidr-change-vpc):
    error: deleting urn:pulumi:dev::aws-py-cidr-change::aws:ec2/vpc:Vpc::cidr-change-vpc: 1 error occurred:
        * Error deleting VPC: DependencyViolation: The vpc 'vpc-0c91e5af1c4c1a40f' has dependencies and cannot be deleted.
        status code: 400, request id: 2a5558c2-0d92-4a70-8b1a-9982b2c1bfdf

  pulumi:pulumi:Stack (aws-py-cidr-change-dev):
    error: update failed

Outputs:
    instance_id   : "i-0a9af2a027aa2b182"
    public_dns    : "ec2-3-26-60-120.ap-southeast-2.compute.amazonaws.com"
    public_ip     : "3.26.60.120"
    vpc_dns_server: "10.42.0.2"

Resources:
    - 5 deleted

Duration: 5m6s
  1. Phase 3: Actions
    • pulumi stack export > stack-state-2.json
    • With the AWS console
      • The original VPC route table is still associated with the original VPC, and the IGW is still attached to the original VPC -> Delete route table
    • pulumi up --refresh
Previewing update (dev)

View Live: https://app.pulumi.com/aureq/aws-py-cidr-change/dev/previews/8db3b94c-0c6a-4ccf-802b-56c9ab6450d7

     Type                                 Name                                   Plan       Info
     pulumi:pulumi:Stack                  aws-py-cidr-change-dev                            
     └─ aws:ec2:Vpc                       cidr-change-vpc                                   [diff: ~tags]
        ├─ aws:ec2:SecurityGroup          cidr-change-public-sg-wg                          [diff: ~tags]
        ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2a                [diff: +__defaults~tags]
        │  └─ aws:ec2:Instance            cidr-change-dummy                                 [diff: ~tags,vpcSecurityGroupIds]
        ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2b                [diff: +__defaults~tags]
        ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2c                [diff: +__defaults~tags]
 ~      ├─ aws:ec2:InternetGateway        cidr-change-igw                        update     [diff: ~vpcId]
        ├─ aws:ec2:SecurityGroup          cidr-change-public-sg-ssh                         [diff: ~tags]
 +      ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2c  create     
 +      ├─ aws:ec2:RouteTable             cidr-change-rt                         create     
 +      ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2a  create     
 +      └─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2b  create     

Resources:
    + 4 to create
    ~ 1 to update
    - 1 to delete
    6 changes. 8 unchanged

Do you want to perform this update? yes
Updating (dev)

View Live: https://app.pulumi.com/aureq/aws-py-cidr-change/dev/updates/7

     Type                                 Name                                   Status                    Info
 ~   pulumi:pulumi:Stack                  aws-py-cidr-change-dev                 **refreshing failed**     1 error
 -   └─ aws:ec2:Vpc                       cidr-change-vpc                        **deleting failed**       1 error
 ~      ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2a     refresh                   
 ~      │  └─ aws:ec2:Instance            cidr-change-dummy                      refresh                   
 ~      ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2c     refresh                   
 ~      ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2b     refresh                   
 ~      ├─ aws:ec2:InternetGateway        cidr-change-igw                        refresh                   
 ~      ├─ aws:ec2:SecurityGroup          cidr-change-public-sg-ssh              refresh                   
 ~      ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2c  refresh                   
 ~      ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2b  refresh                   
 ~      ├─ aws:ec2:RouteTable             cidr-change-rt                         refresh                   
 ~      ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2a  refresh                   
 ~      └─ aws:ec2:SecurityGroup          cidr-change-public-sg-wg               refresh                   

Diagnostics:
  aws:ec2:Vpc (cidr-change-vpc):
    error: deleting urn:pulumi:dev::aws-py-cidr-change::aws:ec2/vpc:Vpc::cidr-change-vpc: 1 error occurred:
        * Error deleting VPC: DependencyViolation: The vpc 'vpc-0c91e5af1c4c1a40f' has dependencies and cannot be deleted.
        status code: 400, request id: 3f326fa3-f5a3-4542-b0c7-654fd9aa1304

  pulumi:pulumi:Stack (aws-py-cidr-change-dev):
    error: update failed

Outputs:
    instance_id   : "i-0a9af2a027aa2b182"
    public_dns    : "ec2-3-26-60-120.ap-southeast-2.compute.amazonaws.com"
    public_ip     : "3.26.60.120"
    vpc_dns_server: "10.42.0.2"

Resources:

Duration: 11m0s
  1. Phase 4: Actions
    • pulumi stack export > stack-state-3.json
    • With the AWS console
      • The original IGW is still attached to the original VPC -> Detach IGW
    • pulumi up --refresh
Previewing update (dev)

View Live: https://app.pulumi.com/aureq/aws-py-cidr-change/dev/previews/8e9aab56-f594-4d42-ac90-eff8b3451dff

     Type                                 Name                                   Plan       Info
     pulumi:pulumi:Stack                  aws-py-cidr-change-dev                            
     └─ aws:ec2:Vpc                       cidr-change-vpc                                   [diff: ~tags]
        ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2a                [diff: +__defaults~tags]
        │  └─ aws:ec2:Instance            cidr-change-dummy                                 [diff: ~tags,vpcSecurityGroupIds]
        ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2c                [diff: +__defaults~tags]
        ├─ aws:ec2:SecurityGroup          cidr-change-public-sg-wg                          [diff: ~tags]
        ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2b                [diff: +__defaults~tags]
        ├─ aws:ec2:SecurityGroup          cidr-change-public-sg-ssh                         [diff: ~tags]
 ~      ├─ aws:ec2:InternetGateway        cidr-change-igw                        update     [diff: ~vpcId]
 +      ├─ aws:ec2:RouteTable             cidr-change-rt                         create     
 +      ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2a  create     
 +      ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2c  create     
 +      └─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2b  create     

Resources:
    + 4 to create
    ~ 1 to update
    - 1 to delete
    6 changes. 8 unchanged

Do you want to perform this update? yes
Updating (dev)

View Live: https://app.pulumi.com/aureq/aws-py-cidr-change/dev/updates/8

     Type                                 Name                                   Status      Info
     pulumi:pulumi:Stack                  aws-py-cidr-change-dev                             
     └─ aws:ec2:Vpc                       cidr-change-vpc                                    [diff: ~tags]
        ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2a                 [diff: +__defaults~tags]
        │  └─ aws:ec2:Instance            cidr-change-dummy                                  [diff: ~tags,vpcSecurityGroupIds]
        ├─ aws:ec2:SecurityGroup          cidr-change-public-sg-wg                           [diff: ~tags]
        ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2c                 [diff: +__defaults~tags]
 ~      ├─ aws:ec2:InternetGateway        cidr-change-igw                        updated     [diff: ~vpcId]
        ├─ aws:ec2:SecurityGroup          cidr-change-public-sg-ssh                          [diff: ~tags]
        ├─ aws:ec2:Subnet                 cidr-change-subnet-ap-southeast-2b                 [diff: +__defaults~tags]
 +      ├─ aws:ec2:RouteTable             cidr-change-rt                         created     
 +      ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2a  created     
 +      ├─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2c  created     
 +      └─ aws:ec2:RouteTableAssociation  vpc-route-table-assoc-ap-southeast-2b  created     

Outputs:
    instance_id   : "i-0a9af2a027aa2b182"
    public_dns    : "ec2-3-26-60-120.ap-southeast-2.compute.amazonaws.com"
    public_ip     : "3.26.60.120"
    vpc_dns_server: "10.42.0.2"

Resources:
    + 4 created
    ~ 1 updated
    - 1 deleted
    6 changes. 8 unchanged

Duration: 14s
  1. Actions
    • pulumi stack export > stack-state-4.json
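
To see which physical resources actually changed between two of the exported phases, the stack-state-X.json files can be compared by URN. The following is a minimal sketch, not part of the original report. It assumes the export keeps resources under `deployment.resources`, each with a `urn` and an optional `id`. The helper names (`load_state`, `resource_ids`, `changed_ids`) and the placeholder IDs are hypothetical. A URN may appear more than once while an old copy is awaiting deletion; this sketch keeps only the last entry it sees.

```python
import json


def load_state(path):
    """Read a file produced by `pulumi stack export > path`."""
    with open(path) as handle:
        return json.load(handle)


def resource_ids(state):
    """Map each resource URN to its physical ID (None when no ID is recorded)."""
    resources = state.get("deployment", {}).get("resources", [])
    return {res["urn"]: res.get("id") for res in resources}


def changed_ids(before, after):
    """Return {urn: (old_id, new_id)} for URNs whose physical ID differs."""
    old, new = resource_ids(before), resource_ids(after)
    return {
        urn: (old[urn], new[urn])
        for urn in sorted(old.keys() & new.keys())
        if old[urn] != new[urn]
    }


# Hypothetical, heavily trimmed states; the IDs are placeholders.
VPC_URN = "urn:pulumi:dev::aws-py-cidr-change::aws:ec2/vpc:Vpc::cidr-change-vpc"
IGW_URN = (
    "urn:pulumi:dev::aws-py-cidr-change::"
    "aws:ec2/vpc:Vpc$aws:ec2/internetGateway:InternetGateway::cidr-change-igw"
)
before = {"deployment": {"resources": [
    {"urn": VPC_URN, "id": "vpc-OLD"},
    {"urn": IGW_URN, "id": "igw-SAME"},
]}}
after = {"deployment": {"resources": [
    {"urn": VPC_URN, "id": "vpc-NEW"},
    {"urn": IGW_URN, "id": "igw-SAME"},
]}}

# Only the VPC is reported, because the IGW keeps its ID when updated in place.
print(changed_ids(before, after))
```

With the real files, the same comparison would be `changed_ids(load_state("stack-state-0.json"), load_state("stack-state-1.json"))`.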

Code

Below are __main__.py and Pulumi.dev.yaml

"""
An AWS Python Pulumi program to securely deploy a Wireguard VPN server
"""
import base64
import ipaddress
import pulumi
import pulumi_aws as aws

awsConfig = pulumi.Config("aws")
aws_region = awsConfig.require("region")

config = pulumi.Config()

service_name = config.get('service_name') or 'cidr-change'
vpc_cidr_block = config.get('vpc_network_range') or '10.100.0.0/16'
vpc_cidr_netmask = config.get('vpc_subnets_netmask') or '255.255.255.0'
wireguard_port = config.get_int('wireguard_port') or 51820

vpc_instance_tenancy='default'
vpc_enable_dns_hostnames=True
vpc_enable_dns_support=True

instance_type = 't4g.nano'

vpc_subnet_length = bin(int(ipaddress.IPv4Address(vpc_cidr_netmask))).count("1")

def get_arch(instance_type):
    """
    The function returns the instance architecture (arm64, i386, x86_64)
    based on the instance type.
    """
    if instance_type.startswith(('t4g', 'a1','c6g', 'm6g', 'r6g')):
        return ['arm64']
    if instance_type.startswith(('t1.micro', 't2.nano', 't2.micro', 't2.small', 't2.medium', 'm1.small','m1.medium', 'c1.medium', 'c3.large')):
        return ['i386', 'x86_64']
    return ['x86_64']

"""
Create an empty Vpc
"""
vpc_name = service_name+'-vpc'
vpc = aws.ec2.Vpc(vpc_name,
    cidr_block=vpc_cidr_block,
    instance_tenancy=vpc_instance_tenancy,
    enable_dns_hostnames=vpc_enable_dns_hostnames,
    enable_dns_support=vpc_enable_dns_support,
    tags={
        'Name': vpc_name
    },
)

vpc_dns_server = str(list(ipaddress.ip_network(vpc_cidr_block).hosts())[1])

"""
Create an internet gateway so the resulting VPC
will have the ability to access the internet.
"""

igw_name = service_name+'-igw'
vpc_igw = aws.ec2.InternetGateway(igw_name,
    vpc_id=vpc.id,
    tags={
        'Name': igw_name
    },
    opts=pulumi.ResourceOptions(parent=vpc)
)

"""
Create a new route table to allow the traffic
to flow in and out of our Vpc.
"""
rt_name = service_name+'-rt'
vpc_route_table = aws.ec2.RouteTable(rt_name,
    vpc_id=vpc.id,
    routes=[aws.ec2.RouteTableRouteArgs(
        cidr_block='0.0.0.0/0',
        gateway_id=vpc_igw.id,
    )],
    tags={
        'Name': rt_name
    },
    opts=pulumi.ResourceOptions(parent=vpc)
)

"""
Create a subnet in each availability-zone in the
selected AWS region. Then, it associates the new
subnet to the route table created as `vpc_route_table`
"""
vpc_subnets = []

all_zones = aws.get_availability_zones()
zone_names = all_zones.names
subnets_list = list(ipaddress.ip_network(vpc_cidr_block).subnets(new_prefix=vpc_subnet_length))

subnet_name_base = service_name+'-subnet'
for zone in zone_names:
    vpc_subnet = aws.ec2.Subnet(f'{subnet_name_base}-{zone}',
        assign_ipv6_address_on_creation=False,
        vpc_id=vpc.id,
        map_public_ip_on_launch=True,
        cidr_block=subnets_list[len(vpc_subnets)].with_prefixlen,
        availability_zone=zone,
        tags={
            'Name': f'{subnet_name_base}-{zone}',
        },
        opts=pulumi.ResourceOptions(parent=vpc)
    )

    aws.ec2.RouteTableAssociation(
        f'vpc-route-table-assoc-{zone}',
        route_table_id=vpc_route_table.id,
        subnet_id=vpc_subnet.id,
        opts=pulumi.ResourceOptions(parent=vpc)
    )
    vpc_subnets.append(vpc_subnet)

"""
Create a new security group that will allow the wireguard
traffic to access our Wireguard server.
"""
vpc_public_security_groups = []

public_sg_name = service_name+'-public-sg-wg'
vpc_public_security_groups.append(aws.ec2.SecurityGroup(public_sg_name,
    vpc_id=vpc.id,
    description='Allow Wireguard client access.',
    tags={
        'Name': public_sg_name
    },
    ingress=[
        aws.ec2.SecurityGroupIngressArgs(
            cidr_blocks=[
                '0.0.0.0/0'],
            from_port=wireguard_port,
            to_port=wireguard_port,
            protocol='udp',
            description='Allow wireguard access.'
        ),
    ],
    egress=[
        aws.ec2.SecurityGroupEgressArgs(
            protocol='-1',
            from_port=0,
            to_port=0,
            cidr_blocks=[
                '0.0.0.0/0'],
        )],
    opts=pulumi.ResourceOptions(
        parent=vpc)
))

public_sg_name = service_name+'-public-sg-ssh'
vpc_public_security_groups.append(aws.ec2.SecurityGroup(public_sg_name,
    vpc_id=vpc.id,
    description='Allow admin SSH access.',
    tags={
        'Name': public_sg_name
    },
    ingress=[
        aws.ec2.SecurityGroupIngressArgs(
            cidr_blocks=[
                '0.0.0.0/0'],
            from_port=22,
            to_port=22,
            protocol='tcp',
            description='Allow SSH access.'
        ),
    ],
    egress=[
        aws.ec2.SecurityGroupEgressArgs(
            protocol='-1',
            from_port=0,
            to_port=0,
            cidr_blocks=['0.0.0.0/0'],
        )],
    opts=pulumi.ResourceOptions(
        parent=vpc)
))

"""
Find the necessary AMI to launch our instance
"""
ami = aws.ec2.get_ami(
    owners=['136693071363'],
    most_recent=True,
    filters=[
        {
            'name': 'architecture',
            'values': get_arch(instance_type)
        }
    ]
)

"""
Create our EC2 instance
"""
with open('cloud-config.yml', 'r') as file:
    user_data_base64 = base64.b64encode(
        bytearray(
            file.read().replace('__aws__region__', aws_region),
            'utf-8'
        )
    ).decode('utf-8')

# Attached stack states: [stackstates.zip](https://github.com/pulumi/pulumi/files/6543326/stackstates.zip)

ec2 = aws.ec2.Instance(service_name+'-dummy',
    instance_type=instance_type,
    ami=ami.id,
    subnet_id=vpc_subnets[0].id,
    vpc_security_group_ids=vpc_public_security_groups,
    source_dest_check=False,
    user_data_base64=user_data_base64,
    iam_instance_profile="",
    tags={
        'Name': service_name
    },
    opts=pulumi.ResourceOptions(parent=vpc_subnets[0])
)

pulumi.export('instance_id', ec2.id)
pulumi.export('public_ip', ec2.public_ip)
pulumi.export('public_dns', ec2.public_dns)
pulumi.export('vpc_dns_server', vpc_dns_server)

config:
  aws-py-cidr-change:service_name: cidr-change
  aws-py-cidr-change:vpc_network_range: 10.137.0.0/16
  aws-py-cidr-change:vpc_subnets_netmask: 255.255.240.0
  aws-py-cidr-change:wireguard_port: "51820"
  aws:region: ap-southeast-2
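
Since the subnet CIDRs and the DNS server address are derived from `vpc_network_range` and `vpc_subnets_netmask`, a CIDR change alters every subnet as well, which is why the preview shows `~cidrBlock` on each of them. The standalone sketch below reproduces that arithmetic with only the standard library. The function names are hypothetical, the zone count of 3 is taken from the subnet names in the logs above, and the DNS address is computed as the network base plus two, which gives the same result as `hosts()[1]` in the program.

```python
import ipaddress
from itertools import islice


def prefix_length(netmask):
    """Convert a dotted netmask such as 255.255.240.0 to a prefix length."""
    return ipaddress.IPv4Network(f"0.0.0.0/{netmask}").prefixlen


def plan_subnets(vpc_cidr, netmask, zone_count):
    """Return the first `zone_count` subnet CIDRs carved out of the VPC range."""
    network = ipaddress.ip_network(vpc_cidr)
    subnets = network.subnets(new_prefix=prefix_length(netmask))
    return [subnet.with_prefixlen for subnet in islice(subnets, zone_count)]


def dns_server(vpc_cidr):
    """Return the VPC base address plus two, as the program's hosts()[1] does."""
    return str(ipaddress.ip_network(vpc_cidr).network_address + 2)


for cidr in ("10.137.0.0/16", "10.42.0.0/16"):
    print(cidr, plan_subnets(cidr, "255.255.240.0", 3), dns_server(cidr))
```

The two DNS addresses it prints, 10.137.0.2 and 10.42.0.2, match the `vpc_dns_server` output change shown in the first preview.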

lukehoban commented 3 years ago

It appears that the change you made to the VPC required replacement of all resources except the InternetGateway, which just needed an update.

Had that succeeded as intended, a new VPC would be created, a new instance would be created in that VPC, the old instance would be deleted, and then the old VPC would be deleted.

Error waiting for internet gateway (igw-0619aca7c131bb260) to detach: timeout while waiting for state to become 'detached' (last state: 'detaching', timeout: 15m0s)

This seems to be the root issue that led to the problems you saw. I'm unsure whether this was a one-off issue, or is consistently reproducible. It sounds like a potential eventual consistency issue in AWS that the upstream AWS provider is not handling. Though it is also not immediately clear that the InternetGateway can successfully update to move between VPCs.

I presume that if this reproduces consistently, there is likely a smaller repro that triggers it and might make clearer where the issue is.

aureq commented 3 years ago

Yes, so that's an additional point for the discussion. An IGW can either be replaced or recreated. It seems we are trying to update it (detach/attach), and since there are so many other dependencies related to that resource, this is what's causing the issue.

@lukehoban Happy to try to find a smaller repro.

aureq commented 3 years ago

It's possible to successfully update a VPC network range if there's no EC2 instance in the said VPC. Many new resources are recreated and at the end the IGW is updated. The IGW id remained the same across updates.

aureq commented 3 years ago

I also tried adding depends_on=[vpc, vpc_igw] when creating the EC2 instance and then updating the IP range of the VPC, but the update failed on the IP range change.

# [...]
ec2 = aws.ec2.Instance(service_name+'-dummy',
    instance_type=instance_type,
    ami=ami.id,
    subnet_id=vpc_subnets[0].id,
    vpc_security_group_ids=vpc_public_security_groups,
    source_dest_check=False,
    user_data_base64=user_data_base64,
    iam_instance_profile="",
    tags={
        'Name': service_name
    },
    opts=pulumi.ResourceOptions(
        parent=vpc_subnets[0],
        depends_on=[vpc, vpc_igw],
    ),
)

lukehoban commented 3 years ago

It's possible to successfully update a VPC network range if there's no EC2 instance in the said VPC.

Aha - so that's the root of the problem. That's a subtle indirect dependency between those resources which is not manifest in the program or in the Pulumi/Terraform resource models.
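
The indirect dependency can be illustrated with a toy ordering check. This is a simplified model, not Pulumi's engine or the AWS API. It assumes, as the observation above suggests, that the IGW cannot finish detaching from the old VPC while the old instance is still running there. The step names and both orderings are hypothetical.

```python
def first_blocked_step(steps):
    """Return the first step that cannot complete, or None if all of them can."""
    old_instance_running = True
    for step in steps:
        if step == "delete old instance":
            old_instance_running = False
        elif step == "detach igw from old vpc" and old_instance_running:
            # Assumed constraint: the detach stays in 'detaching' until the
            # old instance is gone.
            return step
    return None


# Roughly what the failed update attempted: replacements create the new
# resources first and delete the old ones at the end, while the IGW is
# updated in place (detach, then attach).
create_before_delete = [
    "create new vpc",
    "create new instance",
    "detach igw from old vpc",
    "attach igw to new vpc",
    "delete old instance",
    "delete old vpc",
]

# A hypothetical ordering that satisfies the assumed constraint.
instance_deleted_first = [
    "create new vpc",
    "delete old instance",
    "detach igw from old vpc",
    "attach igw to new vpc",
    "create new instance",
    "delete old vpc",
]

print(first_blocked_step(create_before_delete))
print(first_blocked_step(instance_deleted_first))
```

In the first ordering the detach is attempted while the old instance still exists, which mirrors the 15-minute timeout in the original update. In the second, nothing blocks.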

You could likely mark your Instance as delete_before_replace=True to ensure that it is destroyed before the new instance is created. I expect that would resolve this issue (possibly also requiring the depends_on=[vpc_igw]).

aureq commented 3 years ago

I tried the delete_before_replace=True but no luck with that. The stack update is still stuck on updating the IGW (detach/attach) and the EC2 instance is still up and running.

I think it would be simpler to create a new IGW instead of having to detach the existing one when creating the new VPC. Or, have a way to indicate that the EC2 instance should be terminated before updating the IGW (though there are likely other dependencies between these 2 resources).

# [...]
ec2 = aws.ec2.Instance(service_name+'-dummy',
    instance_type=instance_type,
    ami=ami.id,
    subnet_id=vpc_subnets[0].id,
    vpc_security_group_ids=vpc_public_security_groups,
    source_dest_check=False,
    user_data_base64=user_data_base64,
    iam_instance_profile="",
    tags={
        'Name': service_name
    },
    opts=pulumi.ResourceOptions(
        parent=vpc_subnets[0],
        depends_on=[vpc, vpc_igw],
        delete_before_replace=True,
    ),
)