mlevit / aws-auto-cleanup

Programmatically delete AWS resources based on an allowlist and time to live (TTL) settings
MIT License

Question: some resources are skipped because of the whitelist though they are not in the whitelist #97

Closed. membra closed 2 years ago

membra commented 2 years ago

Hello,

I have a couple of questions I am trying to figure out:

1) My whitelist currently looks like this: [image]

In the run logs I see that some resources are skipped because they are whitelisted,

but they are not part of any of the stacks in the whitelist. For example: [image]

The screenshot does not show that this one is outside every whitelisted stack, but I can confirm it is.

2) On the other hand, some resources that ARE in CFN stacks that are in the whitelist are marked as skipped by TTL (though I thought they would be marked as skipped because they are whitelisted).

Could you please clarify how this works? Maybe I don't understand the logic of how the tool works.

Thank you!

mlevit commented 2 years ago

Hey @membra. The way the whitelist functionality works for CloudFormation (CF) is basically this: if the stack is either whitelisted or within its TTL, all of its resources get added to the whitelist.

The last thing I want is for a resource within the stack to be removed before the stack itself, since you could technically have a longer TTL for CF stacks than for, say, Lambda functions.
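
The rule described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names `stack_is_protected` and `propagate_whitelist`, the dictionary shapes, and a TTL expressed in days are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def stack_is_protected(stack_name, created, whitelist, ttl_days, now):
    """Return True if the stack is whitelisted or still within its TTL."""
    if stack_name in whitelist:
        return True
    return now - created < timedelta(days=ttl_days)


def propagate_whitelist(stacks, whitelist, ttl_days, now=None):
    """Collect the resources of every protected stack.

    `stacks` (hypothetical shape) maps a stack name to
    {"created": datetime, "resources": [resource ids]}.
    """
    now = now or datetime.now(timezone.utc)
    protected = set()
    for name, info in stacks.items():
        if stack_is_protected(name, info["created"], whitelist, ttl_days, now):
            # Every resource of a protected stack is itself protected,
            # regardless of the TTL configured for its own resource type.
            protected.update(info["resources"])
    return protected
```

Under this reading, a resource can show up as "whitelisted" in the run log even though only its parent stack, or the stack's TTL, is what protects it.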

membra commented 2 years ago

Thank you very much for a very prompt response!

OK, got it: if a CF stack is either whitelisted or falls within its TTL, its resources will be listed as whitelisted. That is clear.

But as I mentioned in my second point, I added a CF stack to the whitelist and some of its resources are still marked as skipped by TTL. What is the logic in this case?

mlevit commented 2 years ago

That generally shouldn't happen. The only resources it can't add to the whitelist are generally custom resources that don't conform to the service-provider::service-name::data-type-name format.

Can you check your logs to see if there's anything in there?
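
As an illustration of the service-provider::service-name::data-type-name convention mentioned above, here is a small sketch that splits a CloudFormation resource type into its three parts and rejects anything that does not fit, such as a custom resource typed `Custom::<name>`. The helper name `parse_resource_type` is hypothetical and the tool's own parsing may differ.

```python
def parse_resource_type(resource_type):
    """Split 'AWS::IAM::ManagedPolicy' into ('AWS', 'IAM', 'ManagedPolicy').

    Returns None when the type does not have exactly three non-empty
    parts, which is the case for types such as 'Custom::MyResource'.
    """
    parts = resource_type.split("::")
    if len(parts) != 3 or not all(parts):
        return None
    return tuple(parts)
```

For example, `parse_resource_type("AWS::IAM::ManagedPolicy")` yields the three parts, while `parse_resource_type("Custom::MyResource")` returns `None`.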

membra commented 2 years ago

My particular case is IAM policies

Here is the stack and a policy inside it: [image]

Here is this one and other policies marked as skipped by TTL: [image]

And here is this stack in the whitelist: [image]

membra commented 2 years ago

Just an addition: if I change the TTL for IAM policies in DynamoDB so that the policies no longer fall within it, they are marked for deletion (even though, again, they are in a stack that is in the whitelist). [image]

mlevit commented 2 years ago

@membra what's the "Type" of that resource? The screenshot cuts that bit out.

membra commented 2 years ago

AWS::IAM::ManagedPolicy

mlevit commented 2 years ago

Can you set log_level to DEBUG in your serverless.yml file and run this again? I want to see the debug output for this.

Can't think of a reason why this policy is behaving differently.

membra commented 2 years ago

Hi,

Sorry, could you please advise where exactly that config is?

mlevit commented 2 years ago

The file is /app/serverless.yml. Look for the variable log_level.

membra commented 2 years ago

And it requires a redeploy, yes? With npm run deploy -- --region ap-southeast-2 --aws-profile XXX

mlevit commented 2 years ago

Correct. Redeploy and then re-run. Head over to the CloudWatch logs and send me a copy of the logs.

membra commented 2 years ago

test.log

From what I found so far:

First it says:

[DEBUG] IAM ManagedPolicy 'arn:aws:iam::720377698621:policy/iam-as2-gs-g-trusted-advisor' has been added to the whitelist. (cloudformation_cleanup.py, delete_stack(), line 255)

But then this:

[DEBUG] IAM Policy 'iam-as2-gs-g-trusted-advisor' was detatched from IAM Role iam-as2-g-platform. (iam_cleanup.py, policies(), line 247)
[DEBUG] IAM Policy 'iam-as2-gs-g-trusted-advisor' was detatched from IAM Role iam-as2-g-computer-operations. (iam_cleanup.py, policies(), line 247)
[DEBUG] IAM Policy 'iam-as2-gs-g-trusted-advisor' was detatched from IAM Role iam-as2-g-security. (iam_cleanup.py, policies(), line 247)
[DEBUG] IAM Policy 'iam-as2-gs-g-trusted-advisor' was detatched from IAM Role iam-as2-g-systems-assurance. (iam_cleanup.py, policies(), line 247)
[INFO] IAM Policy 'iam-as2-gs-g-trusted-advisor' was last modified 16 days ago and has been deleted. (iam_cleanup.py, policies(), line 333)

Just to confirm, this policy is attached to those roles: [image]

mlevit commented 2 years ago

So this log is a little strange:

[DEBUG] IAM ManagedPolicy 'arn:aws:iam::720377698621:policy/iam-as2-gs-g-trusted-advisor' has been added to the whitelist. (cloudformation_cleanup.py, delete_stack(), line 255)

It should only be iam-as2-gs-g-trusted-advisor, not the full ARN. Maybe the policies are coming back in a different form from what I expected.
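
To make the mismatch concrete: the whitelist entry held the full ARN, while the IAM cleanup log refers to the policy by its bare name. A minimal sketch of reducing one form to the other is shown below. The helper name `policy_name_from_arn` is hypothetical, and this is not necessarily how the fix was implemented.

```python
def policy_name_from_arn(identifier):
    """Return the policy name for an IAM policy ARN, or the input unchanged."""
    prefix, sep, rest = identifier.partition(":policy/")
    if not (sep and prefix.startswith("arn:")):
        # Not a policy ARN (for example, already a bare name).
        return identifier
    # Policies may carry a path (policy/some/path/name); keep the last segment.
    return rest.rsplit("/", 1)[-1]
```

Applied to `arn:aws:iam::720377698621:policy/iam-as2-gs-g-trusted-advisor`, this yields `iam-as2-gs-g-trusted-advisor`, which is the form the cleanup step logs.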

Can you run the following command from a terminal and send me the results for that stack:

aws cloudformation describe-stack-resources --stack-name <your stack name> --region <your region> --output json

membra commented 2 years ago

describe.txt

The very last one here:

    {
        "StackName": "StackSet-iam-roles-537abf7a-b025-4654-8038-c58bfe398038",
        "StackId": "arn:aws:cloudformation:ap-southeast-2:720377698621:stack/StackSet-iam-roles-537abf7a-b025-4654-8038-c58bfe398038/09583a40-5185-11ec-a9c2-0246fc31f5a6",
        "LogicalResourceId": "IamGTrustedAdvisorFullAccessManagedPolicy",
        "PhysicalResourceId": "arn:aws:iam::720377698621:policy/iam-as2-gs-g-trusted-advisor",
        "ResourceType": "AWS::IAM::ManagedPolicy",
        "Timestamp": "2021-11-30T02:28:21.071000+00:00",
        "ResourceStatus": "CREATE_COMPLETE",
        "DriftInformation": {
            "StackResourceDriftStatus": "IN_SYNC"
        }
    }
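
For a long describe-stack-resources dump, a short script can pull out just the entries of one resource type. The sketch below assumes the CLI's JSON output has been saved as text, with the entries under the top-level `StackResources` key. The function name `resources_of_type` is made up for this example.

```python
import json


def resources_of_type(cli_output, resource_type):
    """Return (logical id, physical id) pairs for one resource type."""
    data = json.loads(cli_output)
    return [
        # PhysicalResourceId can be absent, so read it with .get().
        (r["LogicalResourceId"], r.get("PhysicalResourceId"))
        for r in data.get("StackResources", [])
        if r["ResourceType"] == resource_type
    ]
```

Calling it with the contents of describe.txt and `"AWS::IAM::ManagedPolicy"` would list the entry above, with the full ARN as its physical ID.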

mlevit commented 2 years ago

@membra thanks for that. I've made a temporary fix within https://github.com/servian/aws-auto-cleanup/tree/managed-policy-fix

Can you pull that branch, deploy, and run it? Let me know how you go.

membra commented 2 years ago

That particular line changed, but further on it still deleted those policies based on TTL: test.log

[DEBUG] IAM ManagedPolicy 'iam-as2-gs-g-trusted-advisor' has been added to the whitelist. (cloudformation_cleanup.py, delete_stack(), line 260)

...

[DEBUG] IAM Policy 'iam-as2-gs-g-trusted-advisor' was detatched from IAM Role iam-as2-g-platform. (iam_cleanup.py, policies(), line 247)
[DEBUG] IAM Policy 'iam-as2-gs-g-trusted-advisor' was detatched from IAM Role iam-as2-g-computer-operations. (iam_cleanup.py, policies(), line 247)
[DEBUG] IAM Policy 'iam-as2-gs-g-trusted-advisor' was detatched from IAM Role iam-as2-g-security. (iam_cleanup.py, policies(), line 247)
[DEBUG] IAM Policy 'iam-as2-gs-g-trusted-advisor' was detatched from IAM Role iam-as2-g-systems-assurance. (iam_cleanup.py, policies(), line 247)
[INFO] IAM Policy 'iam-as2-gs-g-trusted-advisor' was last modified 16 days ago and has been deleted. (iam_cleanup.py, policies(), line 333)

mlevit commented 2 years ago

So it looks like managed policies are actually classified as ManagedPolicy resources, whereas all other policies are classified as Policy. I've added a translator to convert ManagedPolicy to Policy.

Should hopefully fix it 🤞

Pull and retest.
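
The kind of translator described above might look like the following sketch. Only the ManagedPolicy to Policy mapping comes from this thread. The names `RESOURCE_TYPE_ALIASES` and `whitelist_key`, and the tuple-shaped key, are assumptions for illustration rather than the code in the branch.

```python
# Assumed alias table; only this one entry is taken from the discussion.
RESOURCE_TYPE_ALIASES = {"ManagedPolicy": "Policy"}


def whitelist_key(service, resource_type, resource_id):
    """Build a normalised (service, type, id) key for whitelist lookups."""
    canonical = RESOURCE_TYPE_ALIASES.get(resource_type, resource_type)
    return (service.lower(), canonical.lower(), resource_id)


# The CloudFormation side registers the resource as a ManagedPolicy...
whitelist = {whitelist_key("IAM", "ManagedPolicy", "iam-as2-gs-g-trusted-advisor")}

# ...and the IAM cleanup side looks it up as a Policy. Both now agree.
is_protected = whitelist_key("IAM", "Policy", "iam-as2-gs-g-trusted-advisor") in whitelist
```

Without the alias, the two sides would build different keys for the same policy, so the lookup would miss and the TTL check would apply instead.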

membra commented 2 years ago

Looks like success to me this time: test.log

[DEBUG] IAM Policy 'iam-as2-gs-g-trusted-advisor' has been whitelisted and has not been deleted. (iam_cleanup.py, policies(), line 345)

Thank you very much for a very quick response and fix!

Since there were issues with managed policies at the filtering stage, do you think there might also be issues with deleting them?

mlevit commented 2 years ago

Managed policies whitelisted from CF stacks would have been deleted. Whilst they were technically whitelisted, they were stored under a different category from the one I was looking up.

Thanks for validating the fix. I'll push the change to prod soon.