GoogleCloudPlatform / k8s-config-connector

GCP Config Connector, a Kubernetes add-on for managing GCP resources
https://cloud.google.com/config-connector/docs/overview
Apache License 2.0
893 stars 220 forks

Enhancement: Way to lookup and reference service accounts with project numbers in their name #340

Open karlkfi opened 3 years ago

karlkfi commented 3 years ago

In order to use GKE on a Shared VPC, you must configure GCP role bindings (IAMPolicyMember) against GCP service accounts that are created by GCP automatically. These GSAs have the project number in their name.

In order to avoid needing to template or interpolate these resources, KCC should provide a way to look up the service accounts and reference them in IAMPolicyMembers.

Examples:

# Grant GKE in the "platform-project" (123456789012) project permission to use the Shared VPC
apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMPolicyMember
metadata:
  name: gke-platform-project-network-user
  namespace: projects
  annotations:
    cnrm.cloud.google.com/project-id: admin-project
spec:
  member: "serviceAccount:service-123456789012@container-engine-robot.iam.gserviceaccount.com"
  role: roles/compute.networkUser
  resourceRef:
    apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
    kind: Project
    name: network-project

---

# Grant "Google APIs Service Agent" in the "platform-project" (123456789012) project permission to use the Shared VPC
# https://cloud.google.com/iam/docs/service-accounts#google-managed
apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMPolicyMember
metadata:
  name: gcp-platform-project-network-user
  namespace: projects
  annotations:
    cnrm.cloud.google.com/project-id: admin-project
spec:
  member: "serviceAccount:123456789012@cloudservices.gserviceaccount.com"
  role: roles/compute.networkUser
  resourceRef:
    apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
    kind: Project
    name: network-project

--- 

# Grant GKE in the "platform-project" (123456789012) project permission to act as the "Google APIs Service Agent"
apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMPolicyMember
metadata:
  name: gke-platform-project-service-agent-user
  namespace: projects
  annotations:
    cnrm.cloud.google.com/project-id: admin-project
spec:
  member: "serviceAccount:service-123456789012@container-engine-robot.iam.gserviceaccount.com"
  role: roles/container.hostServiceAgentUser
  resourceRef:
    apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
    kind: Project
    name: network-project

caieo commented 3 years ago

Hi @karlkfi, thank you for this suggestion. In the future, we definitely want to support referencing service accounts by name. However, for your request in particular, I can't think of a great way to look up service accounts based on project numbers without including some amount of templating/assumptions about the SA name. Can you provide some more information on the urgency of this enhancement request?

karlkfi commented 3 years ago

The service accounts that use the project number are almost always generated by GCP services. The two listed, for example, are from GKE and cloudservices. It would be nice to be able to look up the service account being used by a GCP service and reference it that way, but I don't know if there's an API for that yet.

karlkfi commented 3 years ago

I've now found some other use cases where I would like to avoid interpolating the service account email. One idea would be to have a memberRef in IAMPolicyMember that allows referencing IAMServiceAccount resources.

Then we would just need a way to get a service account from a serviceusage Service, so we can use it as a memberRef. This could be done with a serviceRef in the IAMServiceAccount spec. The complication is that it couldn't be managed by KCC. It would just be a reference, like a Terraform data resource.

Another option would be to put a memberServiceRef in the IAMPolicyMember and do the lookup all in the controller.

karlkfi commented 3 years ago

I think the info we need is in here: gcloud services list --format json

karlkfi commented 3 years ago

$ gcloud services list --format json | jq -r '.[] | select(.name == "projects/685587273718/services/container.googleapis.com") | .serviceAccounts[].email'
service-685587273718@container-engine-robot.iam.gserviceaccount.com

yashsaini77 commented 3 years ago

Hi @karlkfi, do you know if we can somehow delete the automatically created default Compute service account with KCC?

caieo commented 3 years ago

Hi @karlkfi, thank you for brainstorming some suggestions for us! We will discuss your ideas and update this thread for further clarification and updates.

@yashsaini77, currently, we do not support acquiring generated service accounts (part of what Karl is asking for). Therefore, Config Connector cannot delete automatically created default Compute service accounts, which is what you were asking for in #341.

lostick commented 3 years ago

@caieo do you guys plan to support this in the future? I'm facing the same issue where I can't create a Project and add it to the Shared VPC in one go, as the project number is not stored anywhere.

If we could have a way to look up generated service accounts (such as xxxxxxxxxxxxxx@cloudservices.gserviceaccount.com or service-xxxxxxxxxxxxxx@container-engine-robot.iam.gserviceaccount.com) dynamically, it'd be great.

caieo commented 3 years ago

@lostick, we have plans to support this in the future, but don't have a solidified timeline yet (looking like sometime in Q3). We'll update this thread with more details as we finalize them!

DWSR commented 3 years ago

Hi @caieo,

Any word on the status of this? I'm running into the same situation as @lostick where adding things to a Shared VPC is a two-step process, which isn't great.

An experience like this would be really nice:

---
apiVersion: serviceusage.cnrm.cloud.google.com/v1beta1
kind: Service
metadata:
  name: container.googleapis.com
  annotations:
    cnrm.cloud.google.com/project-id: my-cool-project
---
apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
kind: Project
metadata:
  name: my-cool-project
# elided for brevity
---
apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
kind: Project
metadata:
  name: my-cool-project
# elided for brevity
---
apiVersion: iam.cnrm.cloud.google.com/v1beta2
kind: IAMPolicyMember
metadata:
  name: shared-vpc-container-grant
spec:
  memberFrom:
    serviceRef:
      name: container.googleapis.com
  role: roles/owner
  resourceRef:
    apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
    kind: Project
    name: my-shared-vpc-project

caieo commented 3 years ago

Hi @DWSR, thanks for chiming in. The experience you described aligns with what we are hoping to support in the future. Unfortunately, we still don't have a solid timeline, but we are looking into supporting this sometime in Q3/before October.

bharathkkb commented 2 years ago

Hi @caieo, any updates on native support for this in KCC?

jtcressy commented 2 years ago

Sorry for another ping @caieo but would it be possible to find any workaround for this? I see further up in the thread it's not possible for KCC to acquire generated service accounts like this. Would it be possible to simply replace the generated accounts with something that KCC could create (and then reference) that the api service (such as container-engine-robot) could use? I know this can be done with the compute engine service account, but it's not clear whether this can be done with others.

The inability to reference project numbers or API service account emails effectively blocks me from repeatedly and cleanly applying YAML for VPC service projects in which I want to run GKE. And for GitOps flows it is generally bad that we have to create two pull requests: one to generate projects, and another to manually insert project numbers.

caieo commented 2 years ago

@jtcressy one workaround that exists is to use kpt's apply-time-mutation function which @karlkfi created an example for here: https://github.com/GoogleContainerTools/kpt-functions-catalog/tree/de1bf5422f697155c22262fa78c75e228ae396f5/contrib/examples/annotate-apply-time-mutations-custom-resource (this example specifically uses the container-engine-robot service account too)

@bharathkkb, unfortunately, we do not have any updates on this feature. We ran into a couple of blockers that caused us to delay prioritizing it. There is an existing workaround through kpt, as mentioned above; please take a look at that if you need to be unblocked. Please reach out to us via GCP support to prioritize your request if needed.

jtcressy commented 2 years ago

> @jtcressy one workaround that exists is to use kpt's apply-time-mutation function which @karlkfi created an example for here: https://github.com/GoogleContainerTools/kpt-functions-catalog/tree/de1bf5422f697155c22262fa78c75e228ae396f5/contrib/examples/annotate-apply-time-mutations-custom-resource (this example specifically uses the container-engine-robot service account too)

@caieo that kpt apply-time-mutation worked fantastically for me, thank you. However, a small problem persists: I am using GCP Config Controller, which manages a GKE cluster dedicated to KCC and includes Config Sync, Policy Controller, etc. I'm noticing what seems to be a flapping scenario with the Config Sync root reconciler.

Please let me know if there's a good place to file an issue to describe this in more detail (as it's not specific to KCC) but here's the gist:

Initial creation of a resource with `config.kubernetes.io/apply-time-mutation` annotation succeeds and the intended mutations are made. Meanwhile, logs from `config-management-system/root-reconciler-559d7469f8-qwbgt:reconciler` indicate that it is trying to change an immutable field (which we originally modified with the apply-time-mutation):

```
KNV2010: unable to update resource: admission webhook "deny-immutable-field-updates.cnrm.cloud.google.com" denied the request: the IAMPolicyMember's spec is immutable
```

If I manually delete the resource and allow it to be recreated by root-reconciler, it is recreated _without_ the apply-time-mutations specified in the annotation. It seems this root-reconciler is not using `kpt live apply` to ensure resources match desired state from git.

The current workaround for me to get out of this situation is to scale root-reconciler to 0, delete the resources, then scale root-reconciler to 1, as there seems to be a different sidecar container that _does_ run `kpt live apply` and correctly performs the mutations before root-reconciler attempts to override it.

karlkfi commented 2 years ago

Config Sync does not work with apply-time-mutation yet. The remediator will fight with the applier in the reconciler, because while the applier supports it, the remediator does not.

You can use apply-time-mutation with kpt, but not if you also manage the object with Config Sync.

We plan to fix this, but it’s been punted for several months now due to other priorities. So I don’t have an ETA. It will likely require significant change to the remediator.

karlkfi commented 1 year ago

Apply-time mutation still isn't supported in Config Sync, but Config Connector has added a new resource called IAMPartialPolicy, which allows referencing IAMServiceAccounts, so you don't need to dynamically generate the service account email address for most common IAM cases: https://cloud.google.com/config-connector/docs/reference/resource-docs/iam/iampartialpolicy#pubsub_admin_policy
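
For illustration, a minimal sketch of an IAMPartialPolicy that references an IAMServiceAccount instead of spelling out its email. The resource names (`example-gsa`, `network-project-partial-policy`) are hypothetical, the project names are borrowed from the examples earlier in this thread, and the field layout is assumed to match the IAMPartialPolicy reference linked above, so check it against the current docs before applying:

```yaml
# Hypothetical service account managed by KCC in the same namespace.
apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMServiceAccount
metadata:
  name: example-gsa  # assumed name, for illustration only
  namespace: projects
  annotations:
    cnrm.cloud.google.com/project-id: platform-project
---
# Grants the role to the account above by reference, with no interpolated email.
apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMPartialPolicy
metadata:
  name: network-project-partial-policy  # assumed name, for illustration only
  namespace: projects
spec:
  resourceRef:
    apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
    kind: Project
    name: network-project
  bindings:
    - role: roles/compute.networkUser
      members:
        - memberFrom:
            serviceAccountRef:
              name: example-gsa
```

Note that this only covers service accounts that exist as IAMServiceAccount resources; it does not by itself resolve Google-generated accounts that embed the project number.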

diviner524 commented 1 year ago

Some updates on this topic since the original issue was opened more than two years ago.

We have introduced the ServiceIdentity resource and the serviceIdentityRef field in IAMPartialPolicy/IAMPolicyMember, which should be able to address most use cases mentioned in this scenario.

Please give it a try and let us know if there are still any limitations.
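
As a rough sketch of how this could replace the first manifest in the original post, assuming the `ServiceIdentity` spec fields (`projectRef`, `resourceID`) and the `memberFrom.serviceIdentityRef` field are as described in the Config Connector reference docs. The metadata names are hypothetical, and the project names are taken from the earlier examples in this thread:

```yaml
# Looks up (or generates) the service identity for GKE in platform-project.
apiVersion: serviceusage.cnrm.cloud.google.com/v1beta1
kind: ServiceIdentity
metadata:
  name: platform-project-container-identity  # assumed name, for illustration only
  namespace: projects
spec:
  projectRef:
    name: platform-project
  resourceID: container.googleapis.com
---
# Grants the GKE service identity permission to use the Shared VPC,
# without hard-coding service-<project number>@container-engine-robot...
apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMPolicyMember
metadata:
  name: gke-platform-project-network-user
  namespace: projects
spec:
  memberFrom:
    serviceIdentityRef:
      name: platform-project-container-identity
  role: roles/compute.networkUser
  resourceRef:
    apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
    kind: Project
    name: network-project
```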

cehoffman commented 1 year ago

@diviner524 I'm using this for the container.googleapis.com service, but what is the method to get a reference to <project number>@cloudservices.gserviceaccount.com? As far as I can tell, this is a foundational service account that comes with project creation and is not something enabled through service usage APIs.

diviner524 commented 1 year ago

@cehoffman This may belong to compute.googleapis.com.

There is a known issue that service identity does not populate email for compute.googleapis.com, and it is being looked into by the GCP Core Compute team.

karlkfi commented 1 year ago

Yeah, we're still missing a project identity solution. Each project has its own default service account. Most GCP services have moved to using service identities, but some still use the project identity.

One remaining notable usage of the project identity is when trying to authorize projects to use a shared VPC in another project.

cehoffman commented 1 year ago

> One remaining notable usage of the project identity is when trying to authorize projects to use a shared VPC in another project.

This is exactly the current issue preventing us from successfully using CC as a replacement for Terraform, at least for project management. Looking forward to improvements in this area. Thanks for the clarity.

snuggie12 commented 11 months ago

I think I'm in the same boat as everyone else trying to deploy something like a GKE cluster with shared VPC and this default account blocks it. Does it make sense to make a separate ticket since I don't believe this is a service agent?

Also, while looking at a random gist I saw there was a cloudapis.googleapis.com service, which returns a different error. I figured it's a long shot but thought I'd mention it.

      Update call failed: error applying desired state: summary: Error creating Service Identity: googleapi: Error 400: Service cloudapis.googleapis.com has not been configured for service identities.
      Help Token: REDACTED
      Details:
      [
        {
          "@type": "type.googleapis.com/google.rpc.PreconditionFailure",
          "violations": [
            {
              "subject": "REDACTED",
              "type": "googleapis.com"
            }
          ]
        },
        {
          "@type": "type.googleapis.com/google.rpc.ErrorInfo",
          "domain": "serviceusage.googleapis.com",
          "reason": "SU_INTERNAL_GENERATE_SERVICE_IDENTITY"
        }
      ]
    reason: UpdateFailed
    status: "False"
    type: Ready