rebuy-de / aws-nuke

Nuke a whole AWS account and delete all its resources.
MIT License

CloudWatchLogsLogGroup - stuck waiting #500

Open mbergal-idx opened 4 years ago

mbergal-idx commented 4 years ago
us-east-2 - CloudWatchLogsLogGroup - /aws/eks/idx-eks-pulumi-test-eksCluster-6586a80/cluster - waiting

Removal requested: 1 waiting, 0 failed, 74 skipped, 95 finished

Not sure what to add, but I am able to delete this log group manually.

mbergal-idx commented 4 years ago

Maybe it's because, when the delete request is made, the EKS cluster is still writing to it?

rbbrasil commented 4 years ago

@mbergal-idx which release are you running?

I've just got this problem with the v2.14.0. With v2.14.0 the removal went fine.


ap-southeast-2 - CloudWatchLogsLogGroup - /aws/vpc/mgmt-VPC/flow-logs - waiting

Removal requested: 1 waiting, 0 failed, 68 skipped, 0 finished

ap-southeast-2 - CloudWatchLogsLogGroup - /aws/vpc/mgmt-VPC/flow-logs - removed

Removal requested: 0 waiting, 0 failed, 68 skipped, 1 finished
rajivchirania commented 4 years ago

@mbergal-idx Your comment above contains the same version. I am also using v2.14.0 and having the same problem with EKS cluster deletion and CloudWatch log group deletion.

rbbrasil commented 4 years ago

Oops! That was a typo. Sorry.

The Docker image version that worked for me was 2.12.0.

mbergal-idx commented 4 years ago

I am using 2.14.0. Deleting a single log group works fine, but if the log group is associated with the cluster, it does not get deleted for some reason.

svenwltr commented 4 years ago

Hello.

Sorry for the late response. Can you give us a hint on how to reproduce this error?

mbergal-idx commented 4 years ago

This happens if it is the EKS cluster's log group. I think the log group gets deleted, but the cluster recreates it because the cluster itself takes longer to be deleted. If this is not enough, I might be able to create a simple Pulumi script as a repro.
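
For what it's worth, a small boto3 sketch (illustrative names from my setup, assumes credentials for the affected account) that could be used to check whether the cluster recreates the group right after a delete:

import time

import boto3

logs = boto3.client('logs', region_name='us-east-2')
name = '/aws/eks/idx-eks-pulumi-test-eksCluster-6586a80/cluster'  # illustrative

logs.delete_log_group(logGroupName=name)

# If the group reappears with a fresh creationTime while the cluster is
# still being torn down, the "cluster recreates it" theory holds.
for _ in range(12):
    time.sleep(10)
    groups = logs.describe_log_groups(logGroupNamePrefix=name)['logGroups']
    match = [g for g in groups if g['logGroupName'] == name]
    if match:
        print('recreated, creationTime:', match[0]['creationTime'])
    else:
        print('still gone')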

rickepnet commented 4 years ago

I have the same issue: the EKS cluster creates a CloudWatch log group, and if I delete the log group by hand it deletes just fine. I am on v2.13.0 and will try to upgrade to see if it helps.

jbarnes commented 3 years ago

👍🏻 on this issue. After restarting the tool against the account in question, the log group is deleted successfully with no problems.

This is a bit of an issue when using automation, as the tool will simply recycle the deletion process and continue to fail. I am not sure whether I can set a retry count and exit with an error, but that would be preferable (a rough wrapper sketch follows below).

v2.14.0 is the version I am using, also in ap-southeast-2 region.
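
Something like this wrapper is what I had in mind for CI (a rough sketch, not part of aws-nuke; the config path, flags, and timeouts are assumptions to adjust for your setup): bound the retries and fail the job instead of waiting forever.

import subprocess
import sys

CMD = ['aws-nuke', '-c', 'nuke-config.yml', '--no-dry-run', '--force']
MAX_ATTEMPTS = 3        # bounded retries instead of an endless "waiting" loop
TIMEOUT_SECONDS = 1800  # kill a run that hangs on a stuck resource

for attempt in range(1, MAX_ATTEMPTS + 1):
    try:
        result = subprocess.run(CMD, timeout=TIMEOUT_SECONDS)
    except subprocess.TimeoutExpired:
        print(f'attempt {attempt}: timed out, retrying', file=sys.stderr)
        continue
    if result.returncode == 0:
        sys.exit(0)
    print(f'attempt {attempt}: exit code {result.returncode}, retrying', file=sys.stderr)

sys.exit(1)  # never converged; fail the pipeline explicitly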

artem-nefedov commented 3 years ago

We're also seeing this issue, and it's also caused by EKS log groups.

Reproduction steps should be:

  1. Create any cluster with the eksctl tool and make sure to enable control plane logging to CloudWatch
  2. Run aws-nuke on the account

The important part is probably that the cluster still exists when aws-nuke is run and is also deleted in the process. Log groups from previously deleted clusters do not cause this issue.
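
If it helps, a quick boto3 sketch (illustrative, not part of aws-nuke) to spot clusters that are still up with control plane logging enabled, i.e. accounts that are in the state described above:

import boto3

eks = boto3.client('eks', region_name='eu-west-1')

for name in eks.list_clusters()['clusters']:
    cluster = eks.describe_cluster(name=name)['cluster']
    logging_conf = cluster.get('logging', {}).get('clusterLogging', [])
    if any(entry.get('enabled') for entry in logging_conf):
        # This cluster will keep writing to /aws/eks/<name>/cluster while it
        # is being deleted, which is when the log group can reappear.
        print(name, 'has control plane logging enabled')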

svenwltr commented 3 years ago

"I think it gets deleted but cluster recreates it since it takes more time for it to get deleted."

I follow this idea. Unfortunately, I do not see an obvious solution to this.

ivan-sukhomlyn commented 2 years ago

I have the same situation with version 2.15. The CloudWatch log group is stuck. Logs:

13:35:15  us-west-2 - CloudWatchLogsLogGroup - /aws/eks/{ cluster_name }/cluster - waiting
13:35:15  
13:35:15  Removal requested: 1 waiting, 0 failed, 147 skipped, 64 finished
13:35:15  
13:35:20  us-west-2 - CloudWatchLogsLogGroup - /aws/eks/{ cluster_name }/cluster - waiting
13:35:20  
13:35:20  Removal requested: 1 waiting, 0 failed, 147 skipped, 64 finished
13:35:20  
13:35:25  us-west-2 - CloudWatchLogsLogGroup - /aws/eks/{ cluster_name }/cluster - waiting
13:35:25  
13:35:25  Removal requested: 1 waiting, 0 failed, 147 skipped, 64 finished
wushingmushine commented 2 years ago

Still seeing aws-nuke (v2.17) hanging indefinitely when deleting CloudWatch log groups. If I cancel the aws-nuke run and re-run it, the log group deletes immediately without issue.

This is unrelated to EKS for me.

Reproduce by using boto3 to create a log group:

import boto3

# accessKeyId and secretAccessKey are defined elsewhere in my script.
mySession = boto3.Session(
    aws_access_key_id=accessKeyId, aws_secret_access_key=secretAccessKey
)
logsClient = mySession.client('logs', region_name='eu-west-1')
log_group_response = logsClient.create_log_group(
    logGroupName='all-rejected-traffic'
)

This is in eu-west-1, in a completely wiped account. The log group is then targeted by a flow log on the default VPC (a sketch of that step follows the output below). Running aws-nuke then gives:

Removal requested: 1 waiting, 0 failed, 296 skipped, 21 finished

eu-west-1 - CloudWatchLogsLogGroup - all-rejected-traffic - waiting

Removal requested: 1 waiting, 0 failed, 296 skipped, 21 finished

eu-west-1 - CloudWatchLogsLogGroup - all-rejected-traffic - waiting

Removal requested: 1 waiting, 0 failed, 296 skipped, 21 finished

eu-west-1 - CloudWatchLogsLogGroup - all-rejected-traffic - waiting

It seems aws-nuke is getting stuck in some indefinite cycle?
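
For completeness, the flow log step from the repro above looks roughly like this (the IAM role ARN is a placeholder; it just needs to allow VPC Flow Logs to write to CloudWatch Logs):

import boto3

ec2 = boto3.client('ec2', region_name='eu-west-1')

# Find the default VPC and point a REJECT-traffic flow log at the
# 'all-rejected-traffic' log group created above.
default_vpc = ec2.describe_vpcs(
    Filters=[{'Name': 'isDefault', 'Values': ['true']}]
)['Vpcs'][0]['VpcId']

ec2.create_flow_logs(
    ResourceIds=[default_vpc],
    ResourceType='VPC',
    TrafficType='REJECT',
    LogDestinationType='cloud-watch-logs',
    LogGroupName='all-rejected-traffic',
    DeliverLogsPermissionArn='arn:aws:iam::123456789012:role/flow-logs-role',  # placeholder
)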

leiarenee commented 1 year ago

Waiting indefinitely with aws-nuke version 2.22.1:

eu-west-1 - CloudWatchEventsRule - Rule: AutoScalingManagedRule - failed
eu-west-1 - CloudWatchLogsLogGroup - /aws/eks/otomi/cluster - [CreatedTime: "168334--183", LastEvent: "2023-05-06T06:20:34+02:00", logGroupName: "/aws/eks/otomi/cluster", tag:created_by: "terragrunt", tag:k8s: "custom", tag:workspace: "testing"] - waiting
eu-west-1 - CloudWatchEventsTarget - Rule: AutoScalingManagedRule Target ID: autoscaling - waiting

Removal requested: 2 waiting, 2 failed, 93 skipped, 181 finished

After a re-run it terminates gracefully.

chronicc commented 1 year ago

Still failing with v2.22.1.15.ge45750a

eu-central-1 - CloudWatchEventsRule - Rule: AutoScalingManagedRule - failed
eu-central-1 - CloudWatchEventsTarget - Rule: AutoScalingManagedRule Target ID: autoscaling - waiting

Removal requested: 1 waiting, 1 failed, 557 skipped, 1387 finished
martivo commented 12 months ago

Still the same issue, using quay.io/rebuy/aws-nuke:v2.23.0. I have two EKS clusters, each with one CloudWatchLogsLogGroup. One of them always gets deleted; the other stays in an infinite loop.

nileshgadgi commented 9 months ago

I am facing the same issue. My GitHub workflow took 6 hours to delete the resources and was automatically cancelled due to the default timeout; it was stuck on the CloudWatchLogsLogGroup deletion.

GitHub Actions gives free organization accounts 3,000 free minutes, I'm getting the same error every time, and I have to work within the GitHub time limits.

eu-west-1 - CloudWatchEventsTarget - Rule: AutoScalingManagedRule Target ID: autoscaling - waiting
eu-west-1 - CloudWatchEventsRule - Rule: AutoScalingManagedRule - waiting
eu-west-1 - CloudWatchLogsLogGroup - /aws/eks/eks-velero-cluster/cluster - [CreatedTime: "1695364752907", LastEvent: "2023-09-22T22:38:20Z", logGroupName: "/aws/eks/eks-velero-cluster/cluster", tag:Environment: "velero", tag:Managedby: "hello@clouddrove.com", tag:Name: "eks-velero-cluster", tag:Repository: "https://github.com/clouddrove/terraform-aws-eks"] - waiting

Removal requested: 3 waiting, 0 failed, 930 skipped, 105 finished

eu-west-1 - CloudWatchEventsTarget - Rule: AutoScalingManagedRule Target ID: autoscaling - waiting
eu-west-1 - CloudWatchEventsRule - Rule: AutoScalingManagedRule - waiting
eu-west-1 - CloudWatchLogsLogGroup - /aws/eks/eks-velero-cluster/cluster - [CreatedTime: "1695364752907", LastEvent: "2023-09-22T22:38:20Z", logGroupName: "/aws/eks/eks-velero-cluster/cluster", tag:Environment: "velero", tag:Managedby: "hello@clouddrove.com", tag:Name: "eks-velero-cluster", tag:Repository: "https://github.com/clouddrove/terraform-aws-eks"] - waiting

Removal requested: 3 waiting, 0 failed, 930 skipped, 105 finished

(the same three lines and summary repeat on every poll until the workflow is cancelled)


lucazz commented 4 months ago

Tagging along with the same issue here. Tearing down EKS clusters and their respective dependencies fails because aws-nuke can't delete the CloudWatch log group.