splunk / splunk-operator

Splunk Operator for Kubernetes

Clean Up AWS Resources After Cluster Deletion #1396

Closed vivekr-splunk closed 3 days ago

vivekr-splunk commented 1 week ago

Clean Up AWS Resources After Cluster Deletion

Description

This PR introduces automation to clean up any remaining AWS resources, specifically Security Groups and OIDC IDs, after a Kubernetes cluster deletion. The objective is to ensure no orphaned resources are left behind once the pipeline completes, preventing unnecessary resource usage and avoiding potential security risks associated with leftover configurations.

Changes

Why This Is Needed

Leaving behind AWS Security Groups and OIDC configurations can lead to:

Testing

Additional Notes

Please review the resource deletion logic to ensure it aligns with existing resource tagging conventions and does not inadvertently delete in-use resources in shared environments.
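One way to act on this note is to check ownership before deleting anything. The following is a minimal sketch, not the PR's implementation: the function names are hypothetical, and the tag key `aws:eks:cluster-name` is an assumption that should be replaced with whatever tag the pipeline actually applies.

```shell
# Sketch: delete a security group only when its tags mark it as belonging to
# the test cluster. OWNER_TAG_KEY is an assumed tag key, not a project value.
OWNER_TAG_KEY="${OWNER_TAG_KEY:-aws:eks:cluster-name}"

# Succeeds only if the group's owner tag equals the given cluster name.
sg_owned_by_cluster() {
  local sg_id="$1" cluster="$2" value
  value=$(aws ec2 describe-security-groups \
    --group-ids "${sg_id}" \
    --query "SecurityGroups[0].Tags[?Key=='${OWNER_TAG_KEY}'].Value | [0]" \
    --output text) || return 1
  [[ "${value}" == "${cluster}" ]]
}

# Deletes the group when the ownership check passes; otherwise skips it.
delete_sg_if_owned() {
  local sg_id="$1" cluster="$2"
  if sg_owned_by_cluster "${sg_id}" "${cluster}"; then
    aws ec2 delete-security-group --group-id "${sg_id}"
  else
    echo "Skipping ${sg_id}: not tagged for ${cluster}" >&2
  fi
}
```

In a shared account, a guard like this keeps a wrong or empty ID from removing another team's group, at the cost of one extra describe call per resource.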


This PR will help maintain a clean AWS environment and improve resource efficiency in our CI/CD pipeline.

vivekr-splunk commented 1 week ago

I think the security group needs to be deleted later, once the cluster has been cleared. I will test it out.

On Tue, Nov 12, 2024 at 9:54 AM Arjun Kondur @.***> wrote:

@.**** commented on this pull request.

In test/deploy-eks-cluster.sh https://github.com/splunk/splunk-operator/pull/1396#discussion_r1838540350 :

@@ -21,6 +21,22 @@ if [[ -z "${EKS_CLUSTER_K8_VERSION}" ]]; then
 fi

 function deleteCluster() {
+  echo "Cleanup role, security-group, open-id ${TEST_CLUSTER_NAME}"
+  account_id=$(aws sts get-caller-identity --query "Account" --output text)
+  rolename=$(echo ${TEST_CLUSTER_NAME} | awk -F- '{print "EBS_" $(NF-1) "_" $(NF)}')
+  role_attached_policies=$(aws iam list-attached-role-policies --role-name $rolename --query 'AttachedPolicies[*].PolicyArn' --output text)
+  for policy_arn in ${role_attached_policies};
+  do
+    aws iam detach-role-policy --role-name ${rolename} --policy-arn ${policy_arn}
+  done
+  aws iam delete-role --role-name ${rolename}
+  oidc_id=$(aws eks describe-cluster --name ${TEST_CLUSTER_NAME} --query "cluster.identity.oidc.issuer" --output text | cut -d '/' -f 5)
+  aws iam delete-open-id-connect-provider --open-id-connect-provider-arn arn:aws:iam::${account_id}:oidc-provider/${oidc_id}
+  security_group_id=$(aws eks describe-cluster --name ${TEST_CLUSTER_NAME} --query "cluster.resourcesVpcConfig.securityGroupIds[0]" --output text)
+  aws ec2 delete-security-group --group-id ${security_group_id}
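A side note on the OIDC step in this diff: as far as I know, IAM names an EKS OIDC provider after the full issuer URL without the `https://` scheme, so an ARN built from only the trailing ID may not match the provider that was created. The sketch below shows the difference; both helper names are hypothetical, and the result is worth checking against `aws iam list-open-id-connect-providers` before relying on it.

```shell
# Builds the IAM OIDC provider ARN from an account ID and an EKS issuer URL.
# Assumes the standard "aws" partition.
oidc_provider_arn() {
  local account_id="$1" issuer_url="$2"
  # Strip the scheme; the remainder of the URL is the provider resource name.
  echo "arn:aws:iam::${account_id}:oidc-provider/${issuer_url#https://}"
}

# For comparison, what `cut -d '/' -f 5` yields: only the trailing ID.
oidc_id_only() {
  echo "$1" | cut -d '/' -f 5
}

# Usage (commented out because it needs AWS credentials):
# issuer=$(aws eks describe-cluster --name "${TEST_CLUSTER_NAME}" \
#   --query "cluster.identity.oidc.issuer" --output text)
# aws iam delete-open-id-connect-provider \
#   --open-id-connect-provider-arn "$(oidc_provider_arn "${account_id}" "${issuer}")"
```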

Looks like security groups also have dependent objects:

An error occurred (DependencyViolation) when calling the DeleteSecurityGroup operation: resource sg-0158103d0e31e7ef1 has a dependent object
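If the dependents are released asynchronously, one option is to retry the deletion for a bounded time and to stop early on any other error. This is a sketch under that assumption, not something tested against this pipeline; the function name, `MAX_ATTEMPTS`, and `SLEEP_SECONDS` are illustrative.

```shell
# Retries delete-security-group while AWS reports DependencyViolation.
# MAX_ATTEMPTS and SLEEP_SECONDS are illustrative defaults, not project values.
delete_sg_with_retry() {
  local sg_id="$1"
  local max_attempts="${MAX_ATTEMPTS:-10}" sleep_seconds="${SLEEP_SECONDS:-30}"
  local attempt err
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    if err=$(aws ec2 delete-security-group --group-id "${sg_id}" 2>&1); then
      echo "Deleted ${sg_id} on attempt ${attempt}"
      return 0
    fi
    # Any error other than a dependency violation will not fix itself.
    if [[ "${err}" != *DependencyViolation* ]]; then
      echo "${err}" >&2
      return 1
    fi
    if ((attempt < max_attempts)); then
      sleep "${sleep_seconds}"
    fi
  done
  echo "Giving up on ${sg_id} after ${max_attempts} attempts" >&2
  return 1
}
```

Retrying only helps when the dependent object eventually disappears on its own. If something still references the group permanently, the loop ends with a non-zero status and the reference has to be removed explicitly.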


akondur commented 1 week ago

I also tried deleting it after the cluster deletion, and it still fails:

. . .
2024-11-12 16:01:18 [✔] all cluster resources were deleted
Cluster eks-sok-smoke-test-cluster-managersecret-109839685 deleted successfully
. .
Deleting security group
An error occurred (DependencyViolation) when calling the DeleteSecurityGroup operation: resource sg-0e17b70c84157efe2 has a dependent object

I believe it may have network interfaces linked to it.

Thanks and Regards, Arjun Kondur
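The network-interface hypothesis above can be checked directly. The sketch below lists the interfaces attached to a group and, as a second possible cause, other groups whose inbound rules reference it. Both function names are hypothetical; the `group-id` and `ip-permission.group-id` filters are standard EC2 CLI filters.

```shell
# Prints network interfaces that are still associated with a security group.
list_sg_network_interfaces() {
  local sg_id="$1"
  aws ec2 describe-network-interfaces \
    --filters "Name=group-id,Values=${sg_id}" \
    --query 'NetworkInterfaces[*].[NetworkInterfaceId,Status,Description]' \
    --output text
}

# Prints security groups whose inbound rules reference the given group.
list_sg_rule_references() {
  local sg_id="$1"
  aws ec2 describe-security-groups \
    --filters "Name=ip-permission.group-id,Values=${sg_id}" \
    --query 'SecurityGroups[*].[GroupId,GroupName]' \
    --output text
}

# Usage (commented out because it needs AWS credentials):
# list_sg_network_interfaces sg-0e17b70c84157efe2
# list_sg_rule_references sg-0e17b70c84157efe2
```

If either command prints anything, that output names the dependent object behind the DependencyViolation error.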
