jblankfeld opened this issue 5 years ago
The Spark operator in its current form is well suited for cluster mode jobs, but not so well for interactive client mode apps, e.g., spark-shell. We can make the operator create a pod running spark-shell and a headless service for the executors to connect to the driver. Is this something you are interested in?
Yes, this is exactly what I had in mind; it would be very handy. At the moment I'm going to go with a Helm chart, but the operator with all its configuration options would be great.
Question: how do you plan to use the pod running spark-shell? Attach into the container and run spark-shell? I'm not sure what the entry point of the driver container would be in this case.
I am using the spark-shell command as the entrypoint and I attach to the pod with kubectl attach -it my-driver. The only downside is that I cannot detach without stopping the Spark driver.
Here is the Helm template I use:
apiVersion: v1
kind: Pod
metadata:
  name: {{ include "spark-shell-k8s.fullname" . }}
  labels:
{{ include "spark-shell-k8s.labels" . | indent 4 }}
spec:
  imagePullSecrets:
    - name: {{ .Values.image.pullSecrets }}
  restartPolicy: Never
  serviceAccountName: {{ .Values.spark.serviceAccountName }}
  automountServiceAccountToken: true
  containers:
    - name: {{ .Chart.Name }}
      image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
      imagePullPolicy: {{ .Values.image.pullPolicy }}
      tty: true
      stdin: true
      {{- if .Values.spark.persistentVolumeClaims }}
      volumeMounts:
        {{- range .Values.spark.persistentVolumeClaims }}
        - name: "{{- toString . }}-volume"
          mountPath: "/mnt/{{- toString . }}"
        {{- end }}
      {{- end }}
      command:
        - /opt/spark/bin/spark-shell
        - --master
        - k8s://https://$(KUBERNETES_SERVICE_HOST)
        - --name
        - {{ include "spark-shell-k8s.fullname" . }}
      args:
        - --conf
        - spark.driver.host={{ include "spark-shell-k8s.fullname" . }}.{{ .Release.Namespace }}
        - --conf
        - spark.driver.port=7079
        - --conf
        - spark.executor.instances={{ .Values.spark.executor.instances }}
        - --conf
        - spark.executor.cores={{ .Values.spark.executor.cores }}
        - --conf
        - spark.kubernetes.namespace={{ .Release.Namespace }}
        - --conf
        - spark.kubernetes.container.image={{ .Values.image.repository }}:{{ .Values.image.tag }}
        - --conf
        - spark.kubernetes.container.image.pullPolicy={{ .Values.image.pullPolicy }}
        - --conf
        - spark.kubernetes.container.image.pullSecrets={{ .Values.image.pullSecrets }}
        - --conf
        - spark.kubernetes.driver.pod.name={{ include "spark-shell-k8s.fullname" . }}
        {{- range .Values.spark.persistentVolumeClaims }}
        - --conf
        - spark.kubernetes.executor.volumes.persistentVolumeClaim.{{- toString . }}-volume.options.claimName={{- toString . }}
        - --conf
        - spark.kubernetes.executor.volumes.persistentVolumeClaim.{{- toString . }}-volume.mount.path=/mnt/{{- toString . }}
        {{- end }}
      ports:
        - name: driver
          protocol: TCP
          containerPort: 7079
        - name: ui
          protocol: TCP
          containerPort: 4040
  {{- if .Values.spark.persistentVolumeClaims }}
  volumes:
    {{- range .Values.spark.persistentVolumeClaims }}
    - name: "{{- toString . }}-volume"
      persistentVolumeClaim:
        claimName: "{{- toString . }}"
    {{- end }}
  {{- end }}
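For context, the values this template references would look roughly like the following. None of the concrete defaults appear in the thread, so every value below is illustrative:

image:
  repository: my-registry/spark   # illustrative
  tag: "3.0.0"                    # illustrative
  pullPolicy: IfNotPresent
  pullSecrets: my-pull-secret     # illustrative
spark:
  serviceAccountName: spark
  executor:
    instances: 2
    cores: 1
  persistentVolumeClaims:
    - data-pvc                    # illustrative PVC name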
And there is an associated headless service on top of that.
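A minimal sketch of that headless service, assuming the service name matches the spark.driver.host set above; the selector labels (and the spark-shell-k8s.name helper) are assumptions about the chart's label helpers, not taken from the thread:

apiVersion: v1
kind: Service
metadata:
  name: {{ include "spark-shell-k8s.fullname" . }}
spec:
  clusterIP: None   # headless: DNS resolves directly to the driver pod IP
  selector:
    # assumed labels on the driver pod, following common Helm conventions
    app.kubernetes.io/name: {{ include "spark-shell-k8s.name" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
  ports:
    - name: driver
      port: 7079
      targetPort: 7079
    - name: ui
      port: 4040
      targetPort: 4040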
Cool. We can definitely add support for this by creating a driver pod running spark-shell, and a headless service for the driver pod.
Any progress on this? It would be good to have something like:
sparkctl shell --<pyspark|spark> <<driver_name>> -n <<namespace>>
Any progress on it?
Any progress on the issue?
@liyinan926 any progress on this issue?
Any progress on this issue?
Is there any progress now?
Any progress on this issue?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi, I would like to know if it's currently possible to submit a spark-shell job using the Spark operator. I tried to deploy a SparkApplication using
mainClass: org.apache.spark.repl.Main
(roughly the manifest sketched below), but this does not work: in client mode I get an error, and in cluster mode I get a different error.
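A minimal sketch of what such a SparkApplication might look like, assuming the operator's v1beta2 API; apart from mainClass, the names, image, jar path, and resource settings are illustrative placeholders:

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-repl                 # illustrative name
  namespace: default
spec:
  type: Scala
  mode: cluster                    # client mode fails as described above
  image: my-registry/spark:3.0.0   # placeholder image
  mainClass: org.apache.spark.repl.Main
  mainApplicationFile: "local:///opt/spark/jars/spark-repl_2.12-3.0.0.jar"  # placeholder path
  sparkVersion: "3.0.0"
  driver:
    cores: 1
    memory: "1g"
    serviceAccount: spark
  executor:
    instances: 2
    cores: 1
    memory: "1g"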
I think this should not be too difficult to achieve, but a piece is missing here.
Thanks in advance.