Hi @duongnt, thank you for being interested in this project.
Yes, my latest images support only Spark v2.4.5, which in turn doesn't have the config options you're interested in. Also, Apache Livy itself is not yet ready for Spark 3, since it is built with Scala 2.11 (Spark 3 requires at least Scala 2.12, which is not binary compatible with 2.11). Apache Livy support for Spark 3 is going to be introduced in this PR. Once it is merged, I'm going to add the support to my images and charts.
What is currently available for your use case is the config option spark.kubernetes.node.selector.[labelKey], which can be set via Livy envs on chart installation (--set env.LIVY_SPARK_KUBERNETES_NODE_SELECTOR_[LABELKEY]=[label_value]) or per job via the POST request body ({ ... "conf": { "spark.kubernetes.node.selector.[labelKey]": "[label_value]", ... } ... }).
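For example, a minimal session-creation request body using that option could look roughly like this (the session kind, name, and label key/value are just placeholders for your setup):

    {
      "kind": "pyspark",
      "name": "my-session",
      "conf": {
        "spark.kubernetes.node.selector.nodepool": "livy"
      }
    }

Spark applies that selector to both the driver and executor pods it creates.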
Thanks @jahstreet. I found the node_selector option, but that alone doesn't prevent other workloads from being scheduled onto that nodepool. I guess I'll have to find another way to do this while waiting for that PR to be merged.
Right. You can also always patch the required part of the Spark Kubernetes resource manager, but I don't think that's acceptable for your case. Please share your experience if you manage to solve it.
I managed to solve this with an unexpected tool: argo-events.
Basically, I need to watch for pod creation events for pods whose names are prefixed with livy-session- and which are unschedulable because of the node selector we set earlier, and then add tolerations to them. We could have done this with a MutatingAdmissionWebhook, but I'd rather write a couple of lines of YAML than a new webhook in Go. argo-events allows listening to Kubernetes resource creation events and patching those resources, which was all I needed.
Wow, sounds really cool. Thank you for sharing! 🚀
This is the argo-events YAML I needed to write:
# Watches for Pod ADD events in the livy namespace, filtered to names
# prefixed with livy-session-.
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: livy-pod
spec:
  type: resource
  resource:
    livy-pod:
      namespace: livy
      group: ""
      version: v1
      resource: pods
      eventTypes:
        - ADD
      filter:
        prefix: livy-session-
---
# Forwards events from the event source above to the sensor's HTTP subscriber.
apiVersion: argoproj.io/v1alpha1
kind: Gateway
metadata:
  name: livy-pod
spec:
  type: resource
  eventSourceRef:
    name: livy-pod
  template:
    serviceAccountName: argo-events-sa
  subscribers:
    http:
      - "http://livy-pod-sensor.argo-events:9300/"
---
# On each matching event, strategic-merge-patches the pod to add the
# toleration for the dedicated nodepool.
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: livy-pod
spec:
  template:
    serviceAccountName: argo-events-sa
  subscription:
    # TODO: Setup subscription over NATS for better performance?
    http:
      port: 9300
  dependencies:
    - name: livy-pod-event-dep
      gatewayName: livy-pod
      eventName: livy-pod
  triggers:
    - template:
        name: patch-pod
        k8s:
          group: ""
          version: v1
          resource: pods
          operation: patch
          patchStrategy: "application/strategic-merge-patch+json"
          source:
            resource:
              apiVersion: v1
              kind: Pod
              metadata:
                name: pod_name  # placeholder, overridden by the parameter below
                namespace: livy
                labels:
                  "patched-by-argo-events": "true"
              spec:
                tolerations:
                  - key: "livy-only"
                    operator: "Equal"
                    value: "true"
                    effect: "NoSchedule"
                containers: []
          parameters:
            # Inject the name of the pod from the event payload as the patch target.
            - src:
                dependencyName: livy-pod-event-dep
                dataTemplate: "{{ .Input.body.metadata.name }}"
              dest: metadata.name
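For reference, the toleration patched in above corresponds to a nodepool taint along these lines (the node name is just an example; how the taint actually gets applied depends on your cloud provider / nodepool tooling):

apiVersion: v1
kind: Node
metadata:
  name: livy-nodepool-node-1  # example node name
spec:
  taints:
    - key: "livy-only"
      value: "true"
      effect: "NoSchedule"

Combined with the spark.kubernetes.node.selector.* option mentioned earlier, this keeps the nodepool exclusive to the Livy session pods.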
First of all, thank you @jahstreet for your amazing work here! I hope your diffs will be merged into Livy soon.
In our setup, we recently have a need for a dedicated nodepool for the Spark driver and executor pods. We have this nodepool set up with labels and taints, but we don't know how to add tolerations to the driver and executor pods from Livy. It looks like this is possible with Spark 3.0.0, but not with 2.4.5?
Not sure if the images you built have this feature?
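For context (this is just my reading, and the names below are examples, not anything Livy- or chart-specific), the Spark 3.0 mechanism that seems to make this possible is pod template support (spark.kubernetes.driver.podTemplateFile / spark.kubernetes.executor.podTemplateFile), where the template can carry the nodeSelector and tolerations for the dedicated nodepool. A rough sketch of such a template:

# Sketch of a Spark 3 pod template; label and taint names are examples only.
apiVersion: v1
kind: Pod
spec:
  nodeSelector:
    nodepool: livy        # example label on the dedicated nodepool
  tolerations:
    - key: "livy-only"    # example taint on the dedicated nodepool
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"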