boto / boto3

AWS SDK for Python
https://aws.amazon.com/sdk-for-python/
Apache License 2.0

cloudformation `describe_stacks` blocks stack deletion #4192

Closed anatol-ju closed 3 months ago

anatol-ju commented 4 months ago

Describe the bug

I tried to find an answer for this, but it seems it's just a weird behaviour.

I'm running a Lambda to delete CloudFormation stacks using boto3. For this, the Lambda is triggered once a Glue job is finished. If successful, the Lambda should delete the stack containing the Glue job and the stack containing the Lambda afterwards.

When checking AWS Console, I can see that the deletion was triggered (DELETE_IN_PROGRESS), but the stack is returned to CREATE_COMPLETE status within 2 seconds.

Expected Behavior

When triggering deletion of a stack, the waiter (which calls client.describe_stacks) should not interfere with the deletion of the stack.

Current Behavior

The deletion is triggered, but the stack returns to its previous state.


In the Events -> Status reason column in the console, you can see this next to CREATE_COMPLETE:

Export DataLakehouse-dev-Test-JobStack:ExportsOutputRefDataLakehousedevTestJobStack[...] cannot be deleted as it is in use by DataLakehouse-dev-Test-DestroyerStack

Reproduction Steps

This is the code I'm using:

import os

import boto3

client = boto3.client("cloudformation")
waiter = client.get_waiter("stack_delete_complete")

# os.environ["STACK_NAMES"] = "DataLakehouse-dev-Test-JobStack,DataLakehouse-dev-Test-DestroyerStack"
# Note: a set does not preserve the order in which the names are listed.
stack_names = set(os.environ["STACK_NAMES"].split(","))

for stack_name in stack_names:
    client.delete_stack(StackName=stack_name)
    try:
        # The waiter polls describe_stacks: 30 s between attempts, at most 2 attempts.
        waiter.wait(StackName=stack_name,
                    WaiterConfig={
                        'Delay': 30,
                        'MaxAttempts': 2
                    })
    except boto3.exceptions.Boto3Error:
        raise

Alternatively, I tried polling the status in a loop with:

stack_status = client.describe_stacks(StackName=active_stack)["Stacks"][0]["StackStatus"]

This had the same effect.

Possible Solution

No response

Additional Information/Context

No response

SDK version used

1.34.139

Environment details (OS name and version, etc.)

macOS, Sonoma 14.5, Python 3.12

tim-finnigan commented 4 months ago

Thanks for reaching out. I tried to reproduce this issue but was unable to. When running your code snippet, it successfully deleted my stack. Do your stack events note any reason why this might be happening? (You can see those in the console or run describe_stack_events.)

Also, in terms of error handling here, I recommend using except botocore.exceptions.ClientError as error:. If you are still seeing an issue, could you provide the debug logs (with any sensitive info redacted) by adding boto3.set_stream_logger('') to your script?
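For reference, the stack-event check mentioned above could look roughly like the following. This is a minimal sketch, not code from the thread: the helper names `recent_status_reasons` and `main` are hypothetical, and the helper takes the client as a parameter so it can be exercised without AWS access. It relies on `describe_stack_events` returning events newest first.

```python
def recent_status_reasons(client, stack_name, limit=10):
    """Collect (logical_id, status, reason) for the newest events that have a reason."""
    reasons = []
    paginator = client.get_paginator("describe_stack_events")
    for page in paginator.paginate(StackName=stack_name):
        for event in page["StackEvents"]:
            # ResourceStatusReason is optional, so use .get()
            reason = event.get("ResourceStatusReason")
            if reason:
                reasons.append(
                    (event["LogicalResourceId"], event["ResourceStatus"], reason)
                )
                if len(reasons) >= limit:
                    return reasons
    return reasons


def main():
    # Imported lazily so the helper above can be used with a stub client.
    import boto3

    boto3.set_stream_logger("")  # verbose debug logging, as requested in the comment
    cfn = boto3.client("cloudformation")
    for row in recent_status_reasons(cfn, "DataLakehouse-dev-Test-JobStack"):
        print(*row, sep=" | ")
```

Calling `main()` requires valid AWS credentials and a default region; the stack name is the one quoted in the issue.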

anatol-ju commented 4 months ago

@tim-finnigan Thanks a lot for checking! So, it seems the error only happens when you try to delete a stack from inside another stack. Yesterday I was able to modify my infrastructure code so it deletes the same stack that executes the code. I was hoping it would be possible to do the same from a different stack (so delete a set of stacks as specified and afterwards delete the "destroyer stack"). Maybe there is an underlying limitation from AWS that I'm not aware of.

Regarding stack events, this is the only thing I can find:

In the Events -> Status reason column in the console, you can see this next to CREATE_COMPLETE: Export DataLakehouse-dev-Test-JobStack:ExportsOutputRefDataLakehousedevTestJobStack[...] cannot be deleted as it is in use by DataLakehouse-dev-Test-DestroyerStack
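The status reason points at a cross-stack reference: CloudFormation does not delete a stack while one of its exports is imported by another stack. One way to see which stacks hold such imports is to combine `list_exports` and `list_imports`. The following is a sketch under assumptions: `find_importers` is a hypothetical helper, and the `ValidationError` handling reflects the understanding that `list_imports` raises that error when nothing imports the export.

```python
def find_importers(client, stack_name):
    """Map each export of `stack_name` to the names of the stacks importing it."""
    stack_id = client.describe_stacks(StackName=stack_name)["Stacks"][0]["StackId"]

    # Exports are listed per region; keep only those owned by our stack.
    export_names = [
        export["Name"]
        for page in client.get_paginator("list_exports").paginate()
        for export in page["Exports"]
        if export["ExportingStackId"] == stack_id
    ]

    importers = {}
    for name in export_names:
        try:
            pages = client.get_paginator("list_imports").paginate(ExportName=name)
            importers[name] = [stack for page in pages for stack in page["Imports"]]
        except client.exceptions.ClientError as error:
            # Assumption: an export with no importers yields a ValidationError.
            if error.response["Error"]["Code"] != "ValidationError":
                raise
            importers[name] = []
    return importers
```

With a real client, `find_importers(boto3.client("cloudformation"), "DataLakehouse-dev-Test-JobStack")` would be expected to list DataLakehouse-dev-Test-DestroyerStack for the export named in the status reason.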

tim-finnigan commented 3 months ago

Thanks for following up. You can delete stack instances from a stack set as documented here: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/stackinstances-delete.html. Does that address your use case? Otherwise please clarify and refer to this earlier comment:

Also, in terms of error handling here, I recommend using except botocore.exceptions.ClientError as error:. If you are still seeing an issue, could you provide the debug logs (with any sensitive info redacted) by adding boto3.set_stream_logger('') to your script?

anatol-ju commented 3 months ago

@tim-finnigan Thanks Tim! My issue above resulted from an attempt to create a piece of infrastructure that can delete existing stacks (or itself). I have now created a CDK stack that is able to delete itself after a condition is met. This is done through a Lambda function that runs boto3 code. However, it does not seem possible to delete other stacks, for example by providing the stack names. I will modify my code as you suggested and try to give you more information when I have a bit more free time.
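When several stacks are deleted by name, the order matters if they reference each other: a stack that imports a value has to go before the stack that exports it. The following sketch shows one way to derive such an order with the standard library. The function `deletion_order` and its input shape are hypothetical, not part of boto3. Note that an ordering alone would not resolve the situation reported here, since the importing stack is the one hosting the Lambda; in that case the import itself would presumably have to be removed first.

```python
from graphlib import TopologicalSorter


def deletion_order(imports_from):
    """Return stack names so that every importer precedes the stacks it imports from.

    `imports_from` maps a stack name to the set of stacks whose exports it uses.
    Raises graphlib.CycleError if the references form a cycle.
    """
    sorter = TopologicalSorter()
    for importer, exporters in imports_from.items():
        sorter.add(importer)
        for exporter in exporters:
            # The importer is a predecessor: it must be deleted before the exporter.
            sorter.add(exporter, importer)
    return list(sorter.static_order())


order = deletion_order({"DestroyerStack": {"JobStack"}})
print(order)
```

The resulting list could replace the unordered set of names in a loop that calls `delete_stack` and waits for each deletion to complete.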

github-actions[bot] commented 3 months ago

Greetings! It looks like this issue hasn’t been active in longer than five days. We encourage you to check if this is still an issue in the latest release. In the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or upvote with a reaction on the initial post to prevent automatic closure. If the issue is already closed, please feel free to open a new one.