[Enhancement]: Ability to destroy all dependent resources for a spot instance if the instance is preempted by AWS

franklad commented 1 year ago

Description

I would like to be able to use terraform destroy to remove all resources that depend on my spot instance if the instance is preempted by AWS.

Take this as an example: using Terraform, I'm creating an aws_spot_fleet_request and an aws_route53_record. In addition, there is an aws_instances data source that I configured to fetch the IP of my spot instance for the Route53 record. The apply process works perfectly. But now let's say AWS terminates the spot instance: my data source no longer returns the IP, and hence I can't delete the Route53 record.

It would be great to have the ability to clean up all the dangling resources if the instance is terminated by AWS preemptively. If this is already possible, please let me know how it's done.

Affected Resource(s) and/or Data Source(s)

aws_instances data source

Potential Terraform Configuration

No response

References

No response

Would you like to implement a fix?

No response

justinretzolk commented 1 year ago

Hey @franklad 👋 Thank you for taking the time to raise this! Since this is more of a usage question, you may have better luck using one of our community resources, particularly the AWS provider forum. We tend to prefer those avenues for general usage/configuration questions, and focus on bug and feature request type issues here on GitHub.

That said, one thing that I did notice is that the aws_instances data source documentation warns against this usage pattern (though the behavior it mentions does seem slightly different to yours):

It's strongly discouraged to use this data source for querying ephemeral instances (e.g., managed via autoscaling group), as the output may change at any time and you'd need to re-run apply every time an instance comes up or dies.

I wish I had an answer regarding the "proper" way to do this, but I don't have one off the top of my head. I'm certain that the community will be able to help over in those other channels though!

ljluestc commented 1 year ago

Terraform is primarily used for defining and provisioning infrastructure, so resource cleanup based on external events (like AWS instance preemption) is typically handled by scripts, Lambda functions, or other automation tools outside of Terraform.

import boto3

def lambda_handler(event, context):
    # Initialize AWS clients
    ec2_client = boto3.client('ec2')
    route53_client = boto3.client('route53')

    # Find instances tagged preempted=true (this tag is an assumption; something
    # else has to set it, e.g. when the spot interruption notice arrives)
    response = ec2_client.describe_instances(Filters=[{'Name': 'tag:preempted', 'Values': ['true']}])
    # Route 53 A records hold IP addresses, not instance IDs, so collect the IPs
    preempted_ips = {
        instance[key]
        for reservation in response['Reservations']
        for instance in reservation['Instances']
        for key in ('PublicIpAddress', 'PrivateIpAddress')
        if key in instance
    }

    # Find Route 53 records that point at a preempted instance
    # (only the first page of results is checked here)
    hosted_zone_id = 'YOUR_HOSTED_ZONE_ID'
    response = route53_client.list_resource_record_sets(HostedZoneId=hosted_zone_id)

    for record_set in response['ResourceRecordSets']:
        values = [record['Value'] for record in record_set.get('ResourceRecords', [])]
        if any(value in preempted_ips for value in values):
            # Delete the Route 53 record
            route53_client.change_resource_record_sets(
                HostedZoneId=hosted_zone_id,
                ChangeBatch={
                    'Changes': [
                        {
                            'Action': 'DELETE',
                            'ResourceRecordSet': record_set
                        }
                    ]
                }
            )

    return {
        'statusCode': 200,
        'body': 'Cleanup completed successfully.'
    }
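
The handler above has to work out which instances were preempted. A possible variation, sketched here under assumptions, is to trigger the Lambda from the EventBridge "EC2 Spot Instance Interruption Warning" event and delete the record by its name rather than by the instance IP, since the IP may no longer be available. The function names, the record_name parameter, and the idea of passing in the Route 53 client are hypothetical choices for illustration, not something from the original configuration.

```python
from typing import Optional


def interrupted_instance_id(event: dict) -> Optional[str]:
    """Return the instance ID from a spot interruption warning event, else None."""
    if event.get("source") != "aws.ec2":
        return None
    if event.get("detail-type") != "EC2 Spot Instance Interruption Warning":
        return None
    return event.get("detail", {}).get("instance-id")


def find_record_set(record_sets: list, name: str, record_type: str = "A") -> Optional[dict]:
    """Find a record set by name and type, ignoring the trailing dot Route 53 adds."""
    wanted = name.rstrip(".").lower()
    for record_set in record_sets:
        same_name = record_set["Name"].rstrip(".").lower() == wanted
        if same_name and record_set["Type"] == record_type:
            return record_set
    return None


def delete_change_batch(record_set: dict) -> dict:
    """Build the ChangeBatch that deletes one existing record set."""
    return {"Changes": [{"Action": "DELETE", "ResourceRecordSet": record_set}]}


def cleanup(event: dict, route53_client, hosted_zone_id: str, record_name: str) -> bool:
    """Delete record_name if the event is a spot interruption warning.

    route53_client is expected to behave like boto3.client('route53').
    Returns True only when a delete request was sent.
    """
    if interrupted_instance_id(event) is None:
        return False
    # A DELETE must carry the record's current values, so look them up first.
    response = route53_client.list_resource_record_sets(
        HostedZoneId=hosted_zone_id,
        StartRecordName=record_name,
        StartRecordType="A",
        MaxItems="1",
    )
    record_set = find_record_set(response["ResourceRecordSets"], record_name)
    if record_set is None:
        return False
    route53_client.change_resource_record_sets(
        HostedZoneId=hosted_zone_id,
        ChangeBatch=delete_change_batch(record_set),
    )
    return True
```

In a Lambda this could be called as cleanup(event, boto3.client('route53'), hosted_zone_id, record_name) from lambda_handler. Deleting the record outside Terraform leaves the Terraform state out of sync until the next refresh.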

hashicorp / terraform-provider-aws