JahstreetOrg / spark-on-kubernetes-helm

Spark on Kubernetes infrastructure Helm charts repo
Apache License 2.0

Set tolerations for spark drivers and executors #28

Closed duongnt closed 4 years ago

duongnt commented 4 years ago

First of all, thank you @jahstreet for your amazing work here! I hope your changes will be merged into Livy soon.

We recently needed to run Spark driver and executor pods on a dedicated node pool. The node pool is set up with labels and taints, but we don't know how to add tolerations to the driver and executor pods from Livy. It looks like this is possible with Spark 3.0.0, but not with 2.4.5?

I'm not sure whether the image you built has this feature?

jahstreet commented 4 years ago

Hi @duongnt, thank you for your interest in this project. Yes, my latest images support only Spark v2.4.5, which in turn doesn't have the config options you're interested in. Also, Apache Livy itself is not yet ready for Spark 3, since it is built with Scala 2.11 (Spark 3 requires at least Scala 2.12, which is not compatible with 2.11). Apache Livy support for Spark 3 is going to be introduced in this PR. Once it is merged, I'm going to add the support to my images and charts.

What is currently available for your use case is the config option spark.kubernetes.node.selector.[labelKey]. It can be set via Livy envs on chart installation (--set env.LIVY_SPARK_KUBERNETES_NODE_SELECTOR_[LABELKEY]=[label_value]) or per job, in the POST request body used to submit it ({ ... "conf": { "spark.kubernetes.node.selector.[labelKey]": "[label_value]", ... } ... }).
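As a rough illustration of the per-job variant, the request body could be assembled like this. It is only a sketch: the label key `pool` and value `spark` are hypothetical placeholders for whatever labels the node pool carries, and `build_session_payload` is an illustrative helper, not part of Livy or of this chart. The body is meant for Livy's session creation endpoint (POST /sessions); nothing is sent here.

```python
import json

# Prefix of the Spark config option mentioned above.
NODE_SELECTOR_PREFIX = "spark.kubernetes.node.selector."


def build_session_payload(label_key, label_value, kind="spark"):
    """Build a request body that asks Spark to schedule pods on labelled nodes.

    label_key and label_value are assumptions: use the labels of your node pool.
    """
    if not label_key:
        raise ValueError("label_key must be non-empty")
    return {
        "kind": kind,
        "conf": {NODE_SELECTOR_PREFIX + label_key: label_value},
    }


# Hypothetical node label: pool=spark
payload = build_session_payload("pool", "spark")
print(json.dumps(payload, indent=2))
```

Note that a node selector only steers the Spark pods towards the labelled nodes; it does not keep other workloads off them.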

duongnt commented 4 years ago

Thanks @jahstreet. I found the node selector option, but that alone doesn't prevent other workloads from landing on that node pool. I guess I'll have to find another way to do this while waiting for that PR to be merged.

jahstreet commented 4 years ago

Right. You can also always patch the relevant part of the Spark Kubernetes resource manager, but I don't think that is acceptable in your case. Please share your experience if you manage to solve this.

duongnt commented 4 years ago

I managed to solve this with an unexpected tool: argo-events.

Basically, I watch for creation events on pods whose names are prefixed with livy-session-. These pods are unschedulable at first, because the node selector we set earlier points them at the tainted node pool, so I then patch them to add the tolerations. We could have done this with a MutatingAdmissionWebhook, but I'd rather write a few lines of YAML than a new webhook in Go. argo-events can listen for Kubernetes resource creation events and patch those resources, which was all I needed.
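The filter-then-patch logic can be sketched in a few lines of Python. This is an illustration of the idea only, not what argo-events runs internally: `build_toleration_patch` is a hypothetical helper, the example pod name is made up, and the toleration values (taint key livy-only, effect NoSchedule) and the label are the ones used in the YAML later in this thread.

```python
import json

# Name prefix of the pods that should be patched.
SESSION_PREFIX = "livy-session-"


def build_toleration_patch(pod):
    """Return a patch body adding the toleration, or None if the pod is not a Livy session pod."""
    metadata = pod.get("metadata", {})
    name = metadata.get("name", "")
    if not name.startswith(SESSION_PREFIX):
        return None  # not one of our pods, leave it alone
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "namespace": metadata.get("namespace", "livy"),
            "labels": {"patched-by-argo-events": "true"},
        },
        "spec": {
            "tolerations": [
                {
                    "key": "livy-only",
                    "operator": "Equal",
                    "value": "true",
                    "effect": "NoSchedule",
                }
            ],
            "containers": [],
        },
    }


# Hypothetical pod creation event payload.
example_pod = {"metadata": {"name": "livy-session-42-driver", "namespace": "livy"}}
patch = build_toleration_patch(example_pod)
print(json.dumps(patch, indent=2))
```

The resulting body would be sent as a patch with content type application/strategic-merge-patch+json, which is the step argo-events performs for us.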

jahstreet commented 4 years ago

Wow, sounds really cool. Thank you for sharing! 🚀

duongnt commented 4 years ago

This is the argo-events YAML I needed to write:

apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: livy-pod
spec:
  type: resource
  resource:
    livy-pod:
      namespace: livy
      group: ""
      version: v1
      resource: pods
      # React only to newly created pods
      eventTypes:
        - ADD
      filter:
        prefix: livy-session-
---
apiVersion: argoproj.io/v1alpha1
kind: Gateway
metadata:
  name: livy-pod
spec:
  type: resource
  eventSourceRef:
    name: livy-pod
  template:
    serviceAccountName: argo-events-sa
  subscribers:
    http:
      - "http://livy-pod-sensor.argo-events:9300/"
---
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: livy-pod
spec:
  template:
    serviceAccountName: argo-events-sa
  subscription:
    # TODO: Setup subscription over NATS for better performance?
    http:
      port: 9300
  dependencies:
    - name: livy-pod-event-dep
      gatewayName: livy-pod
      eventName: livy-pod
  triggers:
    - template:
        name: patch-pod
        k8s:
          group: ""
          version: v1
          resource: pods
          operation: patch
          patchStrategy: "application/strategic-merge-patch+json"
          source:
            resource:
              apiVersion: v1
              kind: Pod
              metadata:
                # Placeholder, overwritten by the parameter below
                name: pod_name
                namespace: livy
                labels:
                  "patched-by-argo-events": "true"
              spec:
                tolerations:
                - key: "livy-only"
                  operator: "Equal"
                  value: "true"
                  effect: "NoSchedule"
                containers: []
          parameters:
            # Copy the pod name from the event into the patch
            - src:
                dependencyName: livy-pod-event-dep
                dataTemplate: "{{ .Input.body.metadata.name }}"
              dest: metadata.name