kubeflow / spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.

Sparkapplication template #1025

Open jinxingwang opened 4 years ago

jinxingwang commented 4 years ago

I am interested in contributing something similar to the Spark pod template, but for SparkApplication. It can be super useful from a service-management standpoint. For example (a sketch of the resulting per-application boilerplate follows this list):

  1. If I want to require all my users to install a Prometheus sidecar so I can collect metrics and monitor them, I shouldn't have to actively check every submitted SparkApplication spec or teach every user how to set up a SparkApplication with a Prometheus sidecar.
  2. Some teams may want to enforce a securityContext for every running pod in the cluster for security reasons.
  3. Adding some pod annotations for service management.
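Roughly, the boilerplate each user would otherwise have to repeat per application could look like this (a sketch only; the field names assume the v1beta2 driver pod spec, and the image, port, and annotation values are placeholders, not recommendations):

spec:
  driver:
    annotations:
      team.example.com/service: spark             # placeholder annotation for service management
    securityContext:
      runAsNonRoot: true                          # security policy every team would need to carry
    sidecars:
      - name: prometheus-exporter                 # placeholder metrics sidecar
        image: example.com/prometheus-exporter:latest   # placeholder image
        ports:
          - containerPort: 9090                   # placeholder metrics port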

It is possible to write your own webhook to mutate or validate in order to achieve the above use cases, but why not provide this feature within the Spark operator?

I can think of two kinds of templates:

  1. Default SparkApplication templates: used as the default when a field is not set, or merged when the field is a list in the spec.
  2. Enforce SparkApplication templates: overwrite the user's SparkApplication.

jinxingwang commented 4 years ago

@liyinan926 What do you think?

liyinan926 commented 4 years ago

That sounds like a good proposal to me. Can you put together a doc on the details? Thanks!

jinxingwang commented 4 years ago

New Spark operator flags: --sparkapplication-default-template and --sparkapplication-overwrite-template (naming still pending). Each flag will accept a ConfigMap as its input value, e.g. --sparkapplication-default-template=/namespace/configmapName. The content of the ConfigMap should be a SparkApplication spec. We can decide which fields are allowed here, e.g.:

apiVersion: v1
kind: ConfigMap
metadata:
  name: configmapName
  namespace: namespace
data:
  spec: |
    apiVersion: "sparkoperator.k8s.io/v1beta2"
    kind: ScheduledSparkApplication
    metadata:
      name: spark-pi-scheduled
      namespace: default
    spec:
      schedule: "@every 5m"
      concurrencyPolicy: Allow
      successfulRunHistoryLimit: 1
      failedRunHistoryLimit: 3
      template:
        type: Scala
        mode: cluster
        image: gcr.io/spark/spark:v3.0.0
        mainClass: org.apache.spark.examples.SparkPi
        mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-2.3.0.jar
        driver:
          cores: 1
          memory: 512m
        executor:
          cores: 1
          instances: 1
          memory: 512m
        restartPolicy:
          type: Never

Then the Spark operator will take this spec and use it as the default value for every submitted SparkApplication.

--sparkapplication-overwrite-template=/namespace/configmapName: this flag will work mostly the same way, but instead it will overwrite the user's submitted SparkApplication with the values from the input spec. For map and list fields such as metadata, sparkConf, etc. in the SparkApplication spec, it should append to the list and default/overwrite the keys within it.
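As a rough illustration of the intended merge behavior (a sketch only; every key and value below is a placeholder, not part of the proposal):

# Overwrite template stored in the ConfigMap:
spec:
  sparkConf:
    "spark.ui.enabled": "false"
  driver:
    labels:
      managed-by: platform-team
---
# User's submitted SparkApplication (relevant fields only):
spec:
  sparkConf:
    "spark.ui.enabled": "true"          # conflicts with the template
    "spark.eventLog.enabled": "true"    # user-only key
  driver:
    labels:
      app: my-job                       # user-only key
---
# Effective spec after the operator applies the overwrite template:
spec:
  sparkConf:
    "spark.ui.enabled": "false"         # template wins on conflicting keys
    "spark.eventLog.enabled": "true"    # user-only keys are kept
  driver:
    labels:
      managed-by: platform-team
      app: my-job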

liyinan926 commented 4 years ago

@jinxingwang thanks for the proposal. Can you elaborate a bit more on the flag sparkapplication-overwrite-template? What is the relationship between it and sparkapplication-default-template?

Also have you considered an alternative approach originally discussed in https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/pull/661?

jinxingwang commented 4 years ago

@liyinan926 Thanks for the feedback. The relationship between these 2 flags is:

  1. Both are optional.
  2. sparkapplication-default-template is less restrictive. It is for cluster operators to recommend a SparkApplication setup when the user has not set one. For example, if I want to collect user metrics, I can put a default Prometheus sidecar config in sparkapplication-default-template; then, if users don't have their own monitoring setup, they will get our recommended one. If users set this up themselves, their own setup is used.
  3. In comparison, sparkapplication-overwrite-template is stricter: it will overwrite the user's SparkApplication wherever a value is defined in the template. For example, if we want to make sure no pod runs as root in our cluster, I will put runAsNonRoot in the securityContext; in that case, no matter what the user puts in their securityContext, runAsNonRoot will be added, so as a system operator I know no container is running with that potential vulnerability (see the sketch after this list).
  4. If both templates set the same attribute, the value from sparkapplication-overwrite-template wins.
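A minimal sketch of what the overwrite template for the runAsNonRoot example in point 3 could contain (field names assume the v1beta2 driver/executor pod spec):

# Sketch of an overwrite template enforcing runAsNonRoot cluster-wide;
# the operator would apply this regardless of the user's own securityContext.
spec:
  driver:
    securityContext:
      runAsNonRoot: true
  executor:
    securityContext:
      runAsNonRoot: true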

I have looked at #661. It is more like the sparkapplication-overwrite-template I proposed:

  1. One main feature I wanted that it doesn't have is enforcement: in that PR, the system operator has no way to enforce (default/overwrite) good SparkApplication usage if the user decides not to use sparkApplicationClass: pyspark2.
  2. Also, introducing an additional layer of SparkApplication spec feels a bit like overkill.
  3. One pro: it can provide different default specs that users can choose from.

jinxingwang commented 4 years ago

The argument is that if the user cares enough about sparkApplicationClass: pyspark2 to understand what pyspark2 itself provides, they can just set those values directly on their own spec.

jinxingwang commented 4 years ago

@liyinan926 Hi.

github-actions[bot] commented 5 days ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.