Closed beatgeek closed 3 years ago
Hey @beatgeek,
Thanks for raising this. Do you have a spark
service account in your namespace?
That notebook is used over here: https://github.com/kubeflow/manifests/pull/1733
Please note the ClusterRole creation that is necessary. I'm not sure if that is related to the problem you are experiencing.
Yes, I do have a spark service account in my namespace.
I did see the settings mentioned in Manifest #1733
I'm not clear on this instruction: kubectl edit clusterrolebinding spark-operatorsparkoperator-crb. There is no such ClusterRoleBinding in my cluster.
I've created two sets of RBAC objects: one for the spark service account and one for the operator.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark
  namespace: my-nms
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: my-nms
  name: spark-role
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["*"]
- apiGroups: [""]
  resources: ["services"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spark-role-binding
  namespace: my-nms
subjects:
- kind: ServiceAccount
  name: spark
  namespace: my-nms
roleRef:
  kind: Role
  name: spark-role
  apiGroup: rbac.authorization.k8s.io
and
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: sparkop-release-spark-operator
  namespace: spark-operator
rules:
- apiGroups: ["sparkoperator.k8s.io"]
  resources: ["sparkapplications"]
  verbs: ["create", "delete", "deletecollection", "get", "list", "update", "watch", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: sparkop-release-spark-operator
  namespace: spark-operator
roleRef:
  kind: Role
  name: sparkop-release-spark-operator
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: default-editor
  namespace: my-nms
- kind: ServiceAccount
  name: spark
  namespace: my-nms
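One way to sanity-check that these bindings actually grant what they're supposed to is kubectl auth can-i with service-account impersonation; the namespaces and account names below come from the manifests above and would need a live cluster to run against:

```shell
# Can the spark service account in my-nms manage pods there?
kubectl auth can-i create pods \
  --as=system:serviceaccount:my-nms:spark -n my-nms

# Can default-editor create SparkApplications in the operator namespace?
kubectl auth can-i create sparkapplications.sparkoperator.k8s.io \
  --as=system:serviceaccount:my-nms:default-editor -n spark-operator
```

Each command prints "yes" or "no", which is a quicker check than waiting for a job to fail.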
I was able to change the connection to a Dataproc cluster and get it to run. It does, however, seem to have an issue with some syntax in the python job.
Update here - when I used spark-launcher="dataproc" I'm having success, but I'm seeing a Spark job failure coming from the job code.
Job failed with message [SyntaxError: invalid syntax]
The example shows Python 3.7 and the out-of-the-box notebook server runs Python 3.6. I'll confirm, but this is likely the issue.
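A guard at the top of the notebook would turn this kind of interpreter mismatch into an explicit error instead of an opaque SyntaxError from inside the job. This is a hypothetical helper, not part of the Feast notebook:

```python
import sys

def require_python(minimum=(3, 7)):
    """Raise early if the interpreter is older than what the example
    code targets (hypothetical helper, not part of the Feast notebook)."""
    if sys.version_info[:2] < minimum:
        raise RuntimeError(
            f"This notebook targets Python {minimum[0]}.{minimum[1]}+, "
            f"but is running {sys.version_info.major}.{sys.version_info.minor}"
        )

require_python()  # fails fast on a Python 3.6 notebook server
```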
@beatgeek I am seeing a similar issue. When I looked at the Spark driver logs, it's complaining about "projectId". Looking at the stack trace, it seems like the Spark driver pod is trying to create a path under the storage bucket and throwing an error saying "projectId cannot be null".
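If that error is coming from the GCS Hadoop connector not knowing which GCP project to use, one thing to try is setting the connector's project ID explicitly in the Spark conf. The property name fs.gs.project.id is the GCS connector's own setting; the SparkApplication shape below is only a sketch with placeholder names ("feast-ingestion", "my-gcp-project"):

```yaml
# Sketch: passing the GCS connector's project ID to the driver/executors.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: feast-ingestion       # placeholder
  namespace: my-nms
spec:
  sparkConf:
    # Hadoop properties are passed through with the spark.hadoop. prefix
    spark.hadoop.fs.gs.project.id: "my-gcp-project"  # placeholder project
    spark.hadoop.google.cloud.auth.service.account.enable: "true"
```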
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Expected Behavior
Running this line from the feast-kubeflow notebook:
output_file_uri = job.get_output_file_uri()
Expect the job to run.
Current Behavior
Steps to reproduce
Following this notebook - Feast on Kubeflow Notebook
Client settings:
Specifications
Possible Solution
I've updated the RBAC permissions, so the job setup seems correct. I suspect this is still a permissions issue.