shikanon / kubeflow-manifests

kubeflow国内一键安装文件
GNU General Public License v3.0
337 stars 117 forks source link

mpijob error #72

Open sxl1993 opened 2 years ago

sxl1993 commented 2 years ago

error retrieving resource lock kubeflow/mpi-operator: endpoints "mpi-operator" is forbidden: User "system:serviceaccount:kubeflow:mpi-operator" cannot get resource "endpoints" in API group "" in the namespace "kubeflow"

shikanon commented 2 years ago

@seansxl 应该是 mpi 的 RBAC 安装出错了,你可以卸载重新安装,也可以 apply 这段来授权:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app: mpi-operator
    app.kubernetes.io/component: mpijob
    app.kubernetes.io/name: mpi-operator
    kustomize.component: mpi-operator
  name: mpi-operator
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  - serviceaccounts
  verbs:
  - create
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - pods/exec
  verbs:
  - create
- apiGroups:
  - ""
  resources:
  - endpoints
  verbs:
  - create
  - get
  - update
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - roles
  - rolebindings
  verbs:
  - create
  - list
  - watch
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - create
  - list
  - update
  - watch
- apiGroups:
  - apps
  resources:
  - statefulsets
  verbs:
  - create
  - list
  - update
  - watch
- apiGroups:
  - batch
  resources:
  - jobs
  verbs:
  - create
  - list
  - update
  - watch
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions
  verbs:
  - create
  - get
- apiGroups:
  - kubeflow.org
  resources:
  - mpijobs
  - mpijobs/finalizers
  - mpijobs/status
  verbs:
  - '*'
- apiGroups:
  - scheduling.incubator.k8s.io
  - scheduling.sigs.dev
  resources:
  - queues
  - podgroups
  verbs:
  - '*'

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app: mpi-operator
    app.kubernetes.io/component: mpijob
    app.kubernetes.io/name: mpi-operator
    kustomize.component: mpi-operator
  name: mpi-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: mpi-operator
subjects:
- kind: ServiceAccount
  name: mpi-operator
  namespace: kubeflow
sxl1993 commented 2 years ago

并没有解决问题