jinxingwang opened this issue 4 years ago
@liyinan926 What do you think?
That sounds like a good proposal to me. Can you put together a doc on the details? Thanks!
New Spark operator flags: `--sparkapplication-default-template` and `--sparkapplication-overwrite-template` (naming still pending).
Each flag will accept a ConfigMap as its input value, e.g. `--sparkapplication-default-template=/namespace/configmapName`.
The content of the ConfigMap should be a SparkApplication spec. We can decide which fields are allowed here, e.g.:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: configmapName
  namespace: namespace
data:
  spec: |
    apiVersion: "sparkoperator.k8s.io/v1beta2"
    kind: ScheduledSparkApplication
    metadata:
      name: spark-pi-scheduled
      namespace: default
    spec:
      schedule: "@every 5m"
      concurrencyPolicy: Allow
      successfulRunHistoryLimit: 1
      failedRunHistoryLimit: 3
      template:
        type: Scala
        mode: cluster
        image: gcr.io/spark/spark:v3.0.0
        mainClass: org.apache.spark.examples.SparkPi
        mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-2.3.0.jar
        driver:
          cores: 1
          memory: 512m
        executor:
          cores: 1
          instances: 1
          memory: 512m
        restartPolicy:
          type: Never
```
The Spark operator will then take this spec and use it as the default value for every submitted SparkApplication.
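The defaulting behavior described above can be sketched as a recursive merge in which the user's values always win and the template only fills in missing fields. This is a minimal illustration in Python of the proposed semantics, not the operator's actual (Go) implementation; all field names in the example are made up.

```python
def apply_defaults(user_spec: dict, template: dict) -> dict:
    """Fill in template values only where the user's spec leaves them unset."""
    merged = dict(user_spec)
    for key, value in template.items():
        if key not in merged:
            # Field missing from the user's spec: take the template default.
            merged[key] = value
        elif isinstance(merged[key], dict) and isinstance(value, dict):
            # Both sides are nested objects: merge field by field.
            merged[key] = apply_defaults(merged[key], value)
        # Otherwise the user's value wins and the default is ignored.
    return merged

# Hypothetical default template: recommended driver settings plus monitoring.
template = {
    "driver": {"cores": 1, "memory": "512m"},
    "monitoring": {"prometheus": {"port": 8090}},
}
# The user only cares about driver memory.
user = {"driver": {"memory": "2g"}}

merged = apply_defaults(user, template)
```

Here `merged` keeps the user's `driver.memory` of `2g`, while `driver.cores` and the whole `monitoring` section are filled in from the template.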
`--sparkapplication-overwrite-template=/namespace/configmapName`
This flag will work mostly the same way, but instead it will overwrite the user's submitted SparkApplication with the values from the input spec. For list and map fields in the SparkApplication spec, such as metadata and sparkConf, it should append new entries and default/overwrite existing keys.
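The overwrite semantics are the mirror image of defaulting: the template value always wins, while user-only keys in map fields (such as sparkConf) are kept, which gives the append-plus-overwrite behavior described above. Again this is a Python sketch of the proposed semantics with made-up field values, not the operator's implementation.

```python
def apply_overwrites(user_spec: dict, template: dict) -> dict:
    """Force template values onto the user's spec; keep user-only keys."""
    merged = dict(user_spec)
    for key, value in template.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            # Nested objects: recurse so user-only subfields survive.
            merged[key] = apply_overwrites(merged[key], value)
        else:
            # Template value always wins over whatever the user set.
            merged[key] = value
    return merged

# Hypothetical overwrite template enforcing cluster policy.
overwrite = {
    "driver": {"securityContext": {"runAsNonRoot": True}},
    "sparkConf": {"spark.eventLog.enabled": "true"},
}
user = {
    "driver": {"securityContext": {"runAsNonRoot": False}, "memory": "2g"},
    "sparkConf": {"spark.executor.instances": "5"},
}

merged = apply_overwrites(user, overwrite)
```

The user's attempt to set `runAsNonRoot: false` is overridden, their unrelated `driver.memory` is preserved, and the enforced `sparkConf` entry is appended alongside the user's own entry.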
@jinxingwang thanks for the proposal. Can you elaborate a bit more on the flag `sparkapplication-overwrite-template`? What is the relationship between it and `sparkapplication-default-template`?
Also have you considered an alternative approach originally discussed in https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/pull/661?
@liyinan926 Thanks for the feedback. The relationship between these two flags is:
`sparkapplication-default-template` is less restrictive. It lets cluster operators recommend a SparkApplication setup that applies only where the user has not set the corresponding fields. For example, if I want to collect user metrics, I can put a default Prometheus sidecar config in the `sparkapplication-default-template`; users who don't set up monitoring themselves will get our recommended setup, while users who configure it themselves will keep their own setup.
`sparkapplication-overwrite-template` is stricter: any value you put in the template overwrites the user's SparkApplication. For example, if we want to make sure no pod in our cluster runs as root, I can put `runAsNonRoot` in the `securityContext` of the overwrite template. Then, no matter what the user puts in their `securityContext`, `runAsNonRoot` will be added, so as a system operator I know no container is running with that potential vulnerability.
I have also looked at #661. It is closer to the `sparkapplication-overwrite-template` I proposed, e.g. `sparkApplicationClass: pyspark2`. The argument is that if users care enough about `sparkApplicationClass: pyspark2` and understand `pyspark2` well enough, they can just set it on their spec themselves.
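As a concrete illustration of the enforcement use case above, an overwrite-template ConfigMap might look like the following (all names and the namespace are made up for illustration; the exact schema is still part of the open design):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: enforce-security
  namespace: spark-system
data:
  spec: |
    spec:
      driver:
        securityContext:
          runAsNonRoot: true
      executor:
        securityContext:
          runAsNonRoot: true
```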
@liyinan926 Hi.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I am interested in contributing something similar to the Spark pod template for SparkApplication. It can be super useful from a service-management standpoint. For example:
1. If I want to require all my users to install a Prometheus sidecar so I can collect metrics and monitor them without actively checking every submitted SparkApplication spec or teaching users how to configure a SparkApplication for a Prometheus sidecar.
2. Some teams may want to enforce a securityContext for every running pod in the cluster for security reasons.
3. Adding some pod annotations for service management.
It is possible to write your own webhook to mutate or validate specs in order to achieve the above use cases, but why not provide this feature within the Spark operator?
I can think of 2 kinds of templates: 1) default SparkApplication templates, which fill in fields that are not set, or merge with list fields in the spec; 2) enforced SparkApplication templates, which overwrite the user's SparkApplication.