GoogleCloudPlatform / k8s-config-connector

GCP Config Connector, a Kubernetes add-on for managing GCP resources
https://cloud.google.com/config-connector/docs/overview
Apache License 2.0

Dangling policy members when IAMServiceAccount is deleted before IAMPolicyMember #384

Open milowe opened 3 years ago

milowe commented 3 years ago

Describe the bug: The member is not removed from the policy when an IAMPolicyMember (referencing an IAMServiceAccount using memberFrom) is deleted after the IAMServiceAccount is deleted. Tooling like kpt live apply and kpt live destroy does not allow any control over the order in which resources are processed. Repeated usage eventually results in an error that the maximum number of members allowed in the policy has been reached.

ConfigConnector Version 1.34.0

To Reproduce:

  1. Create an IAMServiceAccount.
  2. Create an IAMPolicyMember with any permission, setting memberFrom to the created IAMServiceAccount.
  3. Delete the IAMServiceAccount.
  4. Delete the IAMPolicyMember.
  5. Examine IAM permission in the console. It will show the added permission with the deleted: IAMServiceAccount as member.

xiaobaitusi commented 3 years ago

Hi @milowe

Just to clarify, the IAMPolicyMember CRs were deleted from the k8s Cluster, however in the underlying referenced GCP resource, it will still have that IAMServiceAccount as a member, though the member has been marked as deleted. Is this the case with your issue?

milowe commented 3 years ago

@xiaobaitusi Yes, that is correct. This will eventually block adding new members to a policy, because the maximum number of members allowed in the policy is reached.

xiaobaitusi commented 3 years ago

Thanks for the confirmation. We will look into this and post an update when we have more information.

xiaobaitusi commented 3 years ago

Hi @milowe, we discussed the issue internally today and are still debating the best fix. Extra context on the API behaviour: deleting an IAM service account does not immediately purge the associated permissions. Instead, the members are prefixed with deleted: for ~30 days before they are removed.

In the meantime, there is a workaround you can try: set up a cron job to clean up the deleted members periodically.
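The cleanup step of such a job could look roughly like the sketch below. It assumes the policy is available as a JSON-style dict (for example the output of `gcloud projects get-iam-policy PROJECT_ID --format=json`) and that stale members carry the `deleted:` prefix described above. The function name `strip_deleted_members` is hypothetical and not part of Config Connector.

```python
import copy

# Prefix that IAM puts on members whose underlying principal was deleted.
DELETED_PREFIX = "deleted:"


def strip_deleted_members(policy):
    """Return a copy of an IAM policy dict without `deleted:` members.

    Bindings left without any members are dropped. All other fields
    (for example `etag` and `version`) are kept unchanged.
    """
    cleaned = copy.deepcopy(policy)
    kept_bindings = []
    for binding in cleaned.get("bindings", []):
        members = [
            m for m in binding.get("members", [])
            if not m.startswith(DELETED_PREFIX)
        ]
        if members:
            binding["members"] = members
            kept_bindings.append(binding)
    cleaned["bindings"] = kept_bindings
    return cleaned
```

The cleaned policy could then be written back, for example with `gcloud projects set-iam-policy PROJECT_ID policy.json`. Keeping the `etag` from the fetched policy should help guard against overwriting concurrent changes.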

In the long term, once we support merging members in IAMPolicy, we would recommend using IAMPolicy to avoid this issue. See the Terraform doc on this topic: https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/iam_deleted_members

karlkfi commented 2 years ago

If you need to enforce ordering and are using kpt (v1.0.0-beta.7+) or ConfigSync (v1.11.0+ - coming soon) to apply your resources, you might look into the new depends-on annotation:

depends-on allows deletion in reverse dependency order. So you could annotate your IAMPolicyMember as depending on the IAMServiceAccount and kpt/ConfigSync would handle apply/deletion ordering.
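As a rough illustration, the sketch below builds an IAMPolicyMember manifest carrying such an annotation. It assumes the annotation key is `config.kubernetes.io/depends-on` and that the value has the form `<group>/namespaces/<namespace>/<kind>/<name>` for namespaced objects, which is my reading of the kpt documentation and should be checked against the version you use. The helper names and all arguments are hypothetical.

```python
def depends_on_ref(group, kind, name, namespace=None):
    """Format an object reference for the depends-on annotation.

    Assumed format: <group>/namespaces/<namespace>/<kind>/<name> for
    namespaced objects, <group>/<kind>/<name> for cluster-scoped ones.
    """
    if namespace:
        return f"{group}/namespaces/{namespace}/{kind}/{name}"
    return f"{group}/{kind}/{name}"


def policy_member_manifest(name, namespace, service_account, role, project_id):
    """Build an IAMPolicyMember that depends on its IAMServiceAccount."""
    return {
        "apiVersion": "iam.cnrm.cloud.google.com/v1beta1",
        "kind": "IAMPolicyMember",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {
                # Assumed annotation key; see the lead-in above.
                "config.kubernetes.io/depends-on": depends_on_ref(
                    "iam.cnrm.cloud.google.com",
                    "IAMServiceAccount",
                    service_account,
                    namespace,
                ),
            },
        },
        "spec": {
            "memberFrom": {"serviceAccountRef": {"name": service_account}},
            "role": role,
            "resourceRef": {
                "kind": "Project",
                "external": f"projects/{project_id}",
            },
        },
    }
```

Dumping the returned dict to YAML would give a manifest in which the IAMPolicyMember is applied after, and deleted before, the IAMServiceAccount it references.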

This has a few limitations, but may get you moving:

griseau commented 1 year ago

Hey @xiaobaitusi, do you have any update on this issue? We're facing the same problem but don't use kpt.

goutamtadi1 commented 6 months ago

We are using neither kpt nor IAMPolicy. We still face the issue when we delete the IAMServiceAccount first and the IAMPolicyMember next. We are not setting the order explicitly, but we use a single YAML file to delete the resources, so this happens sometimes. The deleted:svcaccount members are eating up a lot of policy space, and we are unable to create new members after deleting the old ones. We are already hitting the 64KB policy size limit due to this.

Is there any expected timeline to fix this?

gemmahou commented 5 months ago

I'm able to reproduce the issue. The KCC managed resource is implemented using the Terraform provider. This TF doc explains the root cause of the issue and suggests some alternative solutions: https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/iam_deleted_members#_iam_member-resources-and-inconsistent-results

In summary, I feel there are several ways to avoid the issue:

  1. If you are using kpt or Config Controller, @karlkfi's comment describes a way to set a resource dependency, so we can make sure that the IAMPolicyMember is deleted before the IAMServiceAccount. That avoids the TF limitation where the member resource may not remove the deleted: IAMServiceAccount properly.
  2. We also recommend using IAMPolicy to manage creation and deletion of your IAM members. That is also suggested in the above documentation and in @xiaobaitusi's comment.
  3. If you are not using kpt, Config Controller, or IAMPolicy, you'll need to manually remove the deleted: IAMServiceAccount in Cloud Console, or set up a periodic job to do the work. To avoid this issue in the future, you'll probably need to make sure the IAMPolicyMember is deleted before the IAMServiceAccount. We do not have the "depends on" logic implemented in KCC for now, since we already have it in Config Controller. I will split the configuration files and manually delete them in the correct order.
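For the manual ordering mentioned in item 3, a minimal sketch of the sorting step is shown below. It assumes the manifests from the YAML file are already parsed into dicts. The function name `deletion_order` and the priority values are hypothetical. Each delete would still need to complete before the next one starts.

```python
# Lower value means deleted earlier. Kinds not listed keep the middle
# slot, so policy members go first and service accounts go last.
DELETE_PRIORITY = {"IAMPolicyMember": 0, "IAMServiceAccount": 2}
DEFAULT_PRIORITY = 1


def deletion_order(manifests):
    """Return manifests sorted so IAMPolicyMember precedes IAMServiceAccount.

    sorted() is stable, so objects with equal priority keep their
    original relative order.
    """
    return sorted(
        manifests,
        key=lambda m: DELETE_PRIORITY.get(m.get("kind"), DEFAULT_PRIORITY),
    )
```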

At the same time, the KCC team is exploring new strategies to handle resource dependencies.