Closed by dalbhanj 4 years ago
@PatrickXYS Can you help check why the `kubeflow-edit` role doesn't work now? Where does `kubeflow-kubernetes-edit` come from?
This looks like a regression. In v0.7.1 at least, `kubeflow-edit` binds to `default-edit` and we don't see this problem.
@Jeffwan From this clusterrole manifest, `kubeflow-kubernetes-edit` aggregates to the `kubeflow-edit` clusterrole, and `kubeflow-edit` has a rolebinding with the serviceaccount `default-edit`. That's why editing `kubeflow-kubernetes-edit` works.
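The chain described here can be sketched as a RoleBinding manifest. This is a hypothetical illustration, not a verified Kubeflow release manifest; the role and service account names are taken from this thread (which uses both `default-edit` and `default-editor`), and the namespace is the workshop one mentioned below:

```yaml
# Hypothetical sketch: the binding that grants a profile's service
# account the aggregated kubeflow-edit ClusterRole.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: default-editor           # name assumed for illustration
  namespace: eksworkshop         # the user's profile namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubeflow-edit
subjects:
- kind: ServiceAccount
  name: default-editor
  namespace: eksworkshop
```

With a binding like this in place, any rule aggregated into `kubeflow-edit` is automatically granted to the service account, which is why editing the aggregated source role is enough.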
Besides that, I think editing `kubeflow-edit` would also work, but the problem here is that we can't directly edit the `kubeflow-edit` clusterrole with:

```
kubectl edit clusterrole kubeflow-edit -n eksworkshop
```

(Note that clusterroles are cluster-scoped, so the `-n` flag has no effect here.) After we edit it, the rule we added doesn't persist when we check it back.
I'll investigate why we can't directly edit the `kubeflow-edit` clusterrole.
@PatrickXYS Then it makes sense:

> Note: If you edit that ClusterRole, your changes will be overwritten on API server restart via auto-reconciliation. To avoid that overwriting, either do not manually edit the role, or disable auto-reconciliation.
`kubeflow-edit` is a ClusterRole with aggregationRules; that's the reason the change can't be persisted. Instead, we have to change `kubeflow-kubernetes-edit` to persist the changes, and it will be aggregated to `kubeflow-edit`.
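To illustrate the mechanism: an aggregated ClusterRole carries an `aggregationRule`, and a controller in the API server continuously rebuilds its `rules` from the union of every ClusterRole whose labels match the selector. Any rule written directly into the aggregated role is therefore reverted on the next reconciliation. A rough sketch (field names follow the Kubernetes RBAC API; the label key is the one shown later in this thread):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubeflow-edit
aggregationRule:
  clusterRoleSelectors:
  - matchLabels:
      rbac.authorization.kubeflow.org/aggregate-to-kubeflow-edit: "true"
# rules is controller-managed: it is overwritten with the combined
# rules of all ClusterRoles matching the selector above, which is
# why a manual kubectl edit of this role does not persist.
rules: []
```

This is why the fix is to edit one of the source roles (such as `kubeflow-kubernetes-edit`) rather than the aggregated role itself.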
There're two action items:

1. Edit `kubeflow-kubernetes-edit` instead.
2. Update `kubeflow-kubernetes-edit` upstream.

@Jeffwan
Right, but instead of editing the clusterrole, I think it would be more efficient to create a clusterrolebinding with `cluster-admin`:

```
kubectl create clusterrolebinding mpisa --clusterrole=cluster-admin --serviceaccount=$namespace:default-editor
```
Besides that, I'll add logic to find the current k8s namespace:

```python
def get_current_k8s_namespace():
    """Get the current namespace of kubernetes."""
    # Kubernetes mounts the pod's namespace at this path inside the container.
    with open('/var/run/secrets/kubernetes.io/serviceaccount/namespace', 'r') as f:
        return f.read().strip()  # readline() could include a trailing newline
```
> @Jeffwan Right, but instead of editing clusterrole, I think it should be more efficient to create a clusterrolebinding with cluster-admin:
> `kubectl create clusterrolebinding mpisa --clusterrole=cluster-admin --serviceaccount=$namespace:default-editor`

There's a security issue: if you bind `cluster-admin`, that user has all the access and can even delete the kubeflow namespace. Why do you want to give such wide permissions to a non-admin user?

> Besides that, I'll add a logic to find current k8s namespace: `def get_current_k8s_namespace(): ...`

What's the motivation? If we don't specify a namespace, then the resources we create will be inside the user's namespace by default.
> @Jeffwan Right, but instead of editing clusterrole, I think it should be more efficient to create a clusterrolebinding with cluster-admin:
> `kubectl create clusterrolebinding mpisa --clusterrole=cluster-admin --serviceaccount=$namespace:default-editor`
>
> There's a security issue: if you bind `cluster-admin`, that user has all the access and can even delete the kubeflow namespace.

Okay, I was thinking this way would save users from manually editing the clusterrole, and I could delete the rolebinding at the end of the notebook. Given the security issue, we shouldn't do this.

> Besides that, I'll add a logic to find current k8s namespace: `def get_current_k8s_namespace(): ...`
>
> What's the motivation? If we don't specify a namespace, then the resources we create will be inside the user's namespace by default.
The reason I want to find the current k8s namespace is that the serviceaccount argument requires the `<namespace>:<name>` format.
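That formatting requirement can be sketched with a small helper combining the namespace lookup from above. `default_editor_subject` is a hypothetical name of mine, not from the thread, and the out-of-cluster fallback to `default` is an assumption for illustration:

```python
import os

# Path where Kubernetes mounts the pod's namespace (available in-cluster only).
NAMESPACE_FILE = '/var/run/secrets/kubernetes.io/serviceaccount/namespace'

def get_current_k8s_namespace(path=NAMESPACE_FILE):
    """Get the current kubernetes namespace, with an out-of-cluster fallback."""
    if os.path.exists(path):
        with open(path) as f:
            return f.read().strip()  # strip any trailing newline
    return 'default'  # assumed fallback when not running in a pod

def default_editor_subject(namespace):
    """Build the <namespace>:<name> string the --serviceaccount flag expects."""
    return f'{namespace}:default-editor'
```

For example, `default_editor_subject(get_current_k8s_namespace())` yields the value that would be passed as `--serviceaccount` in the kubectl command quoted above.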
Let's go with editing `kubeflow-kubernetes-edit`, by the way; I'll file a PR with all the required modifications.
Besides that, I'll see if I need to file a PR upstream to update `kubeflow-kubernetes-edit`.
Yes. @PatrickXYS I understand your concern. I know limiting a user's permissions is kind of painful in some cases, but this is the right way to go. We need to find the gaps on the way to a real multi-user kubeflow cluster.
If we use something like `cluster-admin`, we hide all the problems. If someone asks for restricted settings, we have to start over.
That's the same as using `Administrator` to deploy kubeflow and then the user asking for minimum IAM policies.
I investigated the upstream `kubeflow/mpi-operator` repo and found that the reason we need to edit `kubeflow-kubernetes-edit` is that the `mpi-operator` clusterrole hasn't been aggregated to `kubeflow-edit`. Link
An example here is `tf-operator` (link): it creates a clusterrole which aggregates to the `kubeflow-edit` clusterrole:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubeflow-tfjobs-edit
  labels:
    rbac.authorization.kubeflow.org/aggregate-to-kubeflow-edit: "true"
    rbac.authorization.kubeflow.org/aggregate-to-kubeflow-tfjobs-admin: "true"
rules:
- apiGroups:
  - kubeflow.org
  resources:
  - tfjobs
  - tfjobs/status
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - deletecollection
  - patch
  - update
```
That's why we don't have to edit a clusterrole to create `tfjobs` resources in this notebook.
IMO, `mpi-operator` should enable the same aggregation rule as `tf-operator`. WDYT @Jeffwan ?
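An equivalent manifest for `mpi-operator` might look like the following. This is a hypothetical sketch mirroring the `tf-operator` example above, not the actual PR; the role name and verb list are assumptions:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubeflow-mpijobs-edit  # hypothetical name, mirroring kubeflow-tfjobs-edit
  labels:
    # This label is what causes the rules below to be aggregated
    # into the kubeflow-edit ClusterRole.
    rbac.authorization.kubeflow.org/aggregate-to-kubeflow-edit: "true"
rules:
- apiGroups:
  - kubeflow.org
  resources:
  - mpijobs
  - mpijobs/status
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - deletecollection
  - patch
  - update
```

With a role like this installed, creating `mpijobs` would work out of the box for users bound to `kubeflow-edit`, with no manual clusterrole editing.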
@PatrickXYS There would be a few components like that; project owners may not know this. Do you want to submit a PR to https://github.com/kubeflow/manifests?
Make sure mxnet-job also has it. I can't remember whether xgboost-operator has a manifest there; if it exists, update it as well.
@Jeffwan Sure, let me check and I'll ping you after I file a PR.
Since the PR is merged and we've root-caused the issue, let's close for now.
`xgboost-operator` doesn't have a manifest there, so I'll file PRs for `mxnet-operator` and `mpi-operator` to make the same change as `tf-operator`.
I was following the Distributed Training tutorial (https://github.com/aws-samples/eks-kubeflow-workshop/blob/master/notebooks/03_Distributed_Training/03_02_Distributed_Training_MPI_Horovod.ipynb) and ran into an issue: I was unable to persist an edited role in order to add the recommended policies.

How to reproduce:
1. Edit the kubeflow-edit role and add a policy in the desired namespace: `kubectl edit clusterrole kubeflow-edit -n kubeflow`
2. Check back to validate whether the policy exists.

Workaround: Here is the solution suggested by @PatrickXYS. Edit `kubeflow-kubernetes-edit` instead and add the recommended policy:

```
kubectl edit clusterrole kubeflow-kubernetes-edit -n kubeflow
```