GoogleCloudPlatform / k8s-config-connector

GCP Config Connector, a Kubernetes add-on for managing GCP resources
https://cloud.google.com/config-connector/docs/overview
Apache License 2.0

Error when deleting StorageBucketAccessControl pointing to deleted ServiceAccount #39

Closed bogdanpetrea closed 4 years ago

bogdanpetrea commented 5 years ago

Spec:
  Bucket Ref:
    Name:  test
  Entity:  user-test@test-project.iam.gserviceaccount.com
  Role:    OWNER
Status:
  Conditions:
    Last Transition Time:  2019-09-13T08:47:28Z
    Message:               Delete call failed: googleapi: Error 400: Unknown user email address: test@test-project.iam.gserviceaccount.com, invalid
    Reason:                DeleteFailed
    Status:                False
    Type:                  Ready
  Email:                   test@test-project.iam.gserviceaccount.com
  Id:                      test/user-test@test-project.iam.gserviceaccount.com

The ServiceAccount test@test-project.iam.gserviceaccount.com has been deleted, so the StorageBucketAccessControl is now useless, but it cannot be deleted: the delete call fails with the 400 error shown above.

AlexBulankou commented 5 years ago

Thanks @bogdanpetrea, I was able to repro this.

  1. Create the following resources:

    apiVersion: storage.cnrm.cloud.google.com/v1alpha2
    kind: StorageBucketAccessControl
    metadata:
      labels:
        label-one: "value-one"
      name: repro39-storagebucketaccesscontrol
    spec:
      bucketRef:
        name: repro39-storagebucket
      entity: serviceAccount:repro39-iamserviceaccount@project-id.iam.gserviceaccount.com
      role: READER
    ---
    apiVersion: storage.cnrm.cloud.google.com/v1alpha2
    kind: StorageBucket
    metadata:
      name: repro39-storagebucket
    ---
    apiVersion: iam.cnrm.cloud.google.com/v1alpha1
    kind: IAMServiceAccount
    metadata:
      labels:
        label-one: "value-one"
      name: repro39-iamserviceaccount
    spec:
      displayName: Example Service Account
  2. Verify storagebucketaccesscontrol was created: kubectl describe StorageBucketAccessControl

  3. Delete service account:

    $ kubectl delete iamserviceaccount repro39-iamserviceaccount
    iamserviceaccount.iam.cnrm.cloud.google.com "repro39-iamserviceaccount" deleted
  4. Try deleting storagebucketaccesscontrol:

    $ kubectl delete storagebucketaccesscontrol repro39-storagebucketaccesscontrol

Expected: the resource is deleted.
Actual: the command hangs and the CNRM controller pod logs report errors.
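
For anyone trying to confirm they are hitting the same hang, here is a minimal sketch of a check on the resource's JSON (as printed by `kubectl get storagebucketaccesscontrol <name> -o json`). It assumes the stuck object carries the `cnrm.cloud.google.com/finalizer` finalizer (seen on the IAMPolicy later in this thread, assumed here for StorageBucketAccessControl) and a `Ready` condition with reason `DeleteFailed`, as in the original report. The function name and the sample object are hypothetical, not part of Config Connector.

```python
# Finalizer name taken from the IAMPolicy output in this thread; assumed to be
# the same on StorageBucketAccessControl.
CNRM_FINALIZER = "cnrm.cloud.google.com/finalizer"


def is_stuck_deleting(resource: dict) -> bool:
    """Return True if a delete was requested but the controller's delete call failed."""
    meta = resource.get("metadata", {})
    # Kubernetes sets deletionTimestamp once a delete has been requested.
    if "deletionTimestamp" not in meta:
        return False
    # The object is kept around for as long as the finalizer is still present.
    if CNRM_FINALIZER not in meta.get("finalizers", []):
        return False
    # Look for the failed-delete condition from the original report.
    for cond in resource.get("status", {}).get("conditions", []):
        if (
            cond.get("type") == "Ready"
            and cond.get("status") == "False"
            and cond.get("reason") == "DeleteFailed"
        ):
            return True
    return False


# Hand-written sample mirroring the describe output in the first comment.
# The deletionTimestamp value is made up for illustration.
sample = {
    "kind": "StorageBucketAccessControl",
    "metadata": {
        "name": "repro39-storagebucketaccesscontrol",
        "deletionTimestamp": "2019-09-13T08:47:00Z",
        "finalizers": [CNRM_FINALIZER],
    },
    "status": {
        "conditions": [
            {
                "type": "Ready",
                "status": "False",
                "reason": "DeleteFailed",
                "message": "Delete call failed: googleapi: Error 400: Unknown user email address",
            }
        ]
    },
}

print(is_stuck_deleting(sample))
```

This only matches the StorageBucketAccessControl symptom; a stuck object that reports no `Reason` on its `Ready` condition would need a looser check.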

bogdanpetrea commented 4 years ago

Any idea if this has been fixed in any of the newer versions?

I've encountered a similar error for IAMPolicy after deleting the IAMServiceAccount and I'm trying to find out if it's worth migrating to a newer version.

Name:         test-default
Namespace:    test
API Version:  iam.cnrm.cloud.google.com/v1alpha1
Kind:         IAMPolicy
Metadata:
...
  Finalizers:
    cnrm.cloud.google.com/finalizer
...
Spec:
  Bindings:
    Members:
      serviceAccount:test.svc.id.goog[test/default]
    Role:  roles/iam.workloadIdentityUser
  Resource Ref:
    API Version:  iam.cnrm.cloud.google.com/v1alpha1
    Kind:         IAMServiceAccount
    Name:         test
Status:
  Conditions:
    Last Transition Time:  2020-01-30T12:04:11Z
    Message:               error clearing IAM policy: error retrieving resource 'test/test' with gvk 'iam.cnrm.cloud.google.com/v1alpha1, Kind=IAMServiceAccount': iamserviceaccounts.iam.cnrm.cloud.google.com "test" not found
    Status:                False
    Type:                  Ready
Events:                    <none>

maqiuyujoyce commented 4 years ago

Hi @bogdanpetrea, the issue still exists in the newest version.

kibbles-n-bytes commented 4 years ago

Hey @bogdanpetrea , thanks for flagging these.

The IAMPolicy issue is a regression that we'll be fixing for next week's release.

As for StorageBucketAccessControl, that resource had been on a special code path that prevented us from solving this issue properly. As of last week, this blocker has been resolved, and we'll work to fix this issue similarly for next week's release.

bogdanpetrea commented 4 years ago

Thanks for the good news @kibbles-n-bytes! Just wanted to mention that I am using a slightly older version of the manager (controller-revision-hash=cnrm-controller-manager-ddc5f7774), so I don't want to trigger a false regression alarm. :)

sprokhorov commented 4 years ago

Hey @kibbles-n-bytes, do you have an estimate for the next release?

kibbles-n-bytes commented 4 years ago

We had to delay the fix slightly; the new ETA is end of next week.

spew commented 4 years ago

This fix is in the 1.5.0 release.