Bobgy opened this issue 4 years ago
Issue-Label Bot is automatically applying the labels:
Label | Probability |
---|---|
kind/bug | 0.82 |
/assign @Bobgy
@Bobgy Isn't the profiles-manager already bound to a GSA?
I looked at one of the auto deployments. It looks like the profiles manager is using the KSA `serviceAccount: profiles-controller-service-account`, which doesn't have a GSA bound to it:
```
kubectl -n kubeflow get serviceAccount -o yaml profiles-controller-service-account

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"labels":{"kustomize.component":"profiles"},"name":"profiles-controller-service-account","namespace":"kubeflow"}}
  creationTimestamp: "2020-06-23T11:45:21Z"
  labels:
    kustomize.component: profiles
  name: profiles-controller-service-account
  namespace: kubeflow
  resourceVersion: "3510"
  selfLink: /api/v1/namespaces/kubeflow/serviceaccounts/profiles-controller-service-account
  uid: 04de24e6-b547-11ea-8696-42010a800149
secrets:
- name: profiles-controller-service-account-token-bvcxp
```
I think this is a bug. If the profiles manager isn't bound to any GSA, it won't have the GCP credentials it needs to attach GSAs to KSAs in newly provisioned namespaces.
Looking at a v1 deployment, it looks like it is bound to the admin account:
```
kubectl --context=kf-v1-0210 -n kubeflow get serviceaccount -o yaml profiles-controller-service-account

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: kf-v1-0210-admin@jlewi-dev.iam.gserviceaccount.com
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"labels":{"app.kubernetes.io/component":"profiles","app.kubernetes.io/instance":"profiles-v1.0.0","app.kubernetes.io/managed-by":"kfctl","app.kubernetes.io/name":"profiles","app.kubernetes.io/part-of":"kubeflow","app.kubernetes.io/version":"v1.0.0","kustomize.component":"profiles"},"name":"profiles-controller-service-account","namespace":"kubeflow"}}
  creationTimestamp: "2020-02-10T19:12:36Z"
  labels:
    app.kubernetes.io/component: profiles
    app.kubernetes.io/instance: profiles-v1.0.0
    app.kubernetes.io/managed-by: kfctl
    app.kubernetes.io/name: profiles
    app.kubernetes.io/part-of: kubeflow
    app.kubernetes.io/version: v1.0.0
    kustomize.component: profiles
  name: profiles-controller-service-account
  namespace: kubeflow
  resourceVersion: "5691"
  selfLink: /api/v1/namespaces/kubeflow/serviceaccounts/profiles-controller-service-account
  uid: 4c172377-4c39-11ea-86b4-42010a8e01a3
secrets:
- name: profiles-controller-service-account-token-rxm4l
```
That's true, it was originally bound by kfctl. We need to bind it via KCC now.
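For context, the workload identity setup has two halves: the KSA carries an `iam.gke.io/gcp-service-account` annotation naming the GSA, and the GSA's IAM policy grants that KSA `roles/iam.workloadIdentityUser`. A minimal sketch of the annotation half, with a placeholder GSA name and project (not taken from this issue):

```yaml
# Sketch only: the GSA name and project are placeholder assumptions.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: profiles-controller-service-account
  namespace: kubeflow
  annotations:
    iam.gke.io/gcp-service-account: kf-admin@project-id.iam.gserviceaccount.com
```

The IAM-side binding is the part that would now come from KCC rather than kfctl.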
Thanks @Bobgy that makes sense; thanks for reporting and fixing this
@Bobgy Let's not close this issue until
@Bobgy Are we going to have to apply the same change as #64 to the auto-deploy script?
Issue-Label Bot is automatically applying the labels:
Label | Probability |
---|---|
area/kfctl | 0.64 |
The most recent deploy "kf-vbp-0629-571" runs pipelines successfully, but the workload identity binding is still not set up properly.
```
$ k get sa pipeline-runner -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: name-user@project-id.iam.gserviceaccount.com
  name: pipeline-runner
  namespace: kubeflow
  ...
```
I'm guessing the change https://github.com/kubeflow/testing/pull/705 didn't make it into the auto deploy. Is there a way we can trigger a manual deploy?
Looked at the Tekton task https://kf-ci-v1.endpoints.kubeflow-ci.cloud.goog/tekton/#/namespaces/auto-deploy/pipelineruns/deploy-kf-master-p5sb2; it's using:

```yaml
params:
- name: revision
  value: gcp_blueprints
- name: url
  value: 'https://github.com/jlewi/testing.git'
type: git
```
@jlewi Can you fix this?
The current auto-deployment was deployed before kubeflow/testing#707 was merged.
```
gcp-blueprint-master kf-vbp-0629-3f5 8:28:38.534253 deploy-kf-master-rpgt4 7cfd7bc gcloud --project=kubeflow-ci-deployment container clusters get-credentials --region=us-central1 kf-vbp-0629-3f5
```
The next auto-deployment will hopefully have the fix.
Thanks for the heads up! I'll wait for the next one then
This is working properly now. I verified in https://kf-vbp-0630-8f5.endpoints.kubeflow-ci-deployment.cloud.goog/_/pipeline/#/runs/details/c7af3e03-87ab-4a6b-b30c-20d8b9a5a8ea that the TFX taxi sample ran successfully.
I found a problem: the `default-editor` KSA in user namespaces didn't get the correct workload identity binding.
Root cause: I used `IAMPolicy` for the workload identity bindings on the `kf-user` GSA. However, in multi-user mode the profile controller also tries to add additional workload identity bindings to the `kf-user` GSA, so the profile controller and the KCC controller periodically overwrite each other's IAM policies.
We should use `IAMPolicyMember` instead.
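For illustration, a minimal sketch of the `IAMPolicyMember` approach (project ID, namespace, and KSA name below are placeholders, not taken from this issue). Unlike `IAMPolicy`, which declares the GSA's entire IAM policy and therefore clobbers bindings added by the profile controller, an `IAMPolicyMember` manages just one binding and leaves the rest of the policy alone:

```yaml
# Sketch only: project, namespace, and KSA name are placeholder assumptions.
# Adds a single workload identity binding on the kf-user GSA, so KCC does not
# overwrite bindings that the profile controller adds separately.
apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMPolicyMember
metadata:
  name: kf-user-workload-identity-binding
spec:
  member: serviceAccount:project-id.svc.id.goog[kubeflow/some-ksa]
  role: roles/iam.workloadIdentityUser
  resourceRef:
    apiVersion: iam.cnrm.cloud.google.com/v1beta1
    kind: IAMServiceAccount
    name: kf-user
```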
@Bobgy Thanks; we will need to cherry-pick this into the 1.0 branch of manifests.
We should verify that notebook permissions are good in the auto deploys before closing this.
I've found quite a few deployments that don't work properly due to a lack of GCP permissions, including:
/cc @jlewi
I think we should bind workload identity to these KSAs by default.
Proposal
Add workload identity binding for
`default-editor` will be automatically bound to the user GSA once the profiles manager works properly.
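For reference, a hedged sketch of how `default-editor` picks up a user GSA once the profiles manager works: if I recall the v1 Profile API correctly, the Profile carries a `WorkloadIdentity` plugin naming the GSA (the user email, project, and GSA below are placeholders, not from this issue):

```yaml
# Sketch only: owner email, project, and GSA name are placeholder assumptions.
# With this plugin, the profile controller annotates default-editor in the
# profile's namespace and adds the matching workload identity IAM binding.
apiVersion: kubeflow.org/v1
kind: Profile
metadata:
  name: some-user
spec:
  owner:
    kind: User
    name: some-user@example.com
  plugins:
  - kind: WorkloadIdentity
    spec:
      gcpServiceAccount: kf-user@project-id.iam.gserviceaccount.com
```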
@jlewi I can implement this, thoughts?