kubeflow / spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Apache License 2.0

Driver & Executor Env variables are not invoked by custom resource "SparkApplication" Application pods. #1640

Closed bharath-rajendran023 closed 1 month ago

bharath-rajendran023 commented 1 year ago

Title: SparkApplication deployment on Azure Kubernetes Services

Issue: A custom resource for the Spark application was deployed into AKS using this chart a few months ago. Previously there was no issue with invoking the env variables defined in the spec section, but now the env section in both the driver and executor is not working. I tried multiple approaches (env, envFrom, valueFrom), but the values are not injected by any of them.

Helm Chart Details:
CHART: spark-operator-1.1.15
APP VERSION: v1beta2-1.3.1-3.1.1

bharath-rajendran023 commented 1 year ago

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: ScheduledSparkApplication
metadata:
  name: spark-test-app
spec:
  schedule: " 5 "
  successfulRunHistoryLimit: 100
  failedRunHistoryLimit: 100
  concurrencyPolicy: Forbid
  template:
    type: Python
    pythonVersion: "3"
    mode: cluster
    image: "custom-image-repo"
    imagePullPolicy: Always
    imagePullSecrets: ["############"]
    mainApplicationFile: "local:///app.py"
    sparkVersion: "3.2.1"
    sparkConf:
      "spark.sql.storeAssignmentPolicy": "LEGACY"
      "spark.sql.legacy.timeParserPolicy": "LEGACY"
    restartPolicy:
      type: Never
    driver:
      envFrom:

bharath-rajendran023 commented 1 year ago

Please help to resolve this at the earliest.

lethanhduong commented 1 year ago

Chart version 1.1.25 works fine; you can try it, @bharath-rajendran023. I get the same issue with the latest version.

Issue:

The environment variables (env) of the SparkApplication aren't rendered into the Spark pods (driver, executor).

  env:
    - name: DEPS
      value: #######
    - name: SENTRY_DSN
      value: https://####
    - name: ENV
      value: prd

But envSecretKeyRefs is working fine. Why?

Information:

Kubernetes version: v1.22

Helm Chart Details:

SparkApplication Manifest

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  generation: 1
  name: bf-r2a-appdw-dimusere43a6
  namespace: data-spark
spec:
  driver:
    coreLimit: 300m
    coreRequest: 150m
    env:
    - name: DEPS
      value: #######
    - name: SENTRY_DSN
      value: https://####
    - name: ENV
      value: prd
    envSecretKeyRefs:
      MONGO_URI:
        key: MONGO_URI
        name: spark-env-secrets
    javaOptions: -XX:+UseCompressedOops
    labels:
      version: 3.3.0

Driver Pod

apiVersion: v1
kind: Pod
metadata:
  name: bf-r2a-appdw-dimusere43a6-driver
  namespace: data-spark
spec:
  containers:
  - env:
    - name: SPARK_USER
      value: root
    - name: SPARK_APPLICATION_ID
      value: spark-aef34e0bb55b49f08ababf6027cca79c
    - name: SPARK_DRIVER_BIND_ADDRESS
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.podIP
    - name: MONGO_URI
      valueFrom:
        secretKeyRef:
          key: MONGO_URI
          name: spark-env-secrets
    - name: PYSPARK_PYTHON
      value: python3
    - name: PYSPARK_DRIVER_PYTHON
      value: python3
    - name: SPARK_LOCAL_DIRS
      value: /var/data/spark-8b9acb38-fd61-4664-b82d-26343957e1db
    - name: SPARK_CONF_DIR
      value: /opt/spark/conf
    - name: AWS_STS_REGIONAL_ENDPOINTS
      value: regional
    - name: AWS_DEFAULT_REGION
      value: us-east-1
    - name: AWS_REGION
      value: us-east-1
    - name: AWS_ROLE_ARN
      value: ##############
    - name: AWS_WEB_IDENTITY_TOKEN_FILE
      value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    image: ltduong/spark-py:v3.3.0-hadoop3-amd
    imagePullPolicy: IfNotPresent
    name: spark-kubernetes-driver

zhaohc10 commented 1 year ago

We got a similar issue when running it on a new Kubernetes cluster. Any update on this?

yuchaoran2011 commented 1 year ago

@bharath-rajendran023 If your environment is k8s 1.22, make sure your build of the Spark operator includes this commit: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/pull/1421. I observed the same symptom as what you reported, and that commit resolved the issue for me.

dilverse commented 1 year ago

I ran into the same problem. Digging deeper into the issue, it looks like the webhook needs to be enabled for the environment variables to get picked up.

You can take a look at this doc

If you are using Helm Charts to install the spark-operator you can try this

helm install my-release spark-operator/spark-operator --namespace spark-operator --set webhook.enable=true

as described in the doc here.
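For an existing install, the same flag can be kept in a values file instead of on the command line (a minimal sketch; the release name `my-release` and the namespace are placeholders carried over from the command above):

```yaml
# values.yaml
# Enable the mutating admission webhook. Without it, the operator cannot
# patch env/envFrom, volumes, tolerations, etc. from the SparkApplication
# spec into the driver and executor pods.
webhook:
  enable: true
```

Then apply it with `helm upgrade --install my-release spark-operator/spark-operator --namespace spark-operator -f values.yaml`.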

Elsayed91 commented 1 year ago

I have this issue as well. Funnily enough, the format

    envVars:
      URI: "{{ params.URI }}"

works no problem, but env and envFrom do not. I do have the webhook enabled and I am running 1.3.8-3.1.1.

I am using the manifest with the webhook, not the Helm chart, but everything seems to be set up. I don't know; envVars are just too ugly, man.

Edit: for me it seems that, since I was not using the Service that comes with the chart, the webhook was not getting picked up. I still didn't use the chart, but I added some labels to my Service and now it is working fine.
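For anyone who hits the same thing: the webhook is registered against a Service that must select the operator pods, so a hand-rolled Service needs matching selector labels. A rough sketch of such a Service follows; the name, namespace, ports, and label values are all assumptions, so check the labels on your operator Deployment's pods and the Service name registered in the MutatingWebhookConfiguration:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: spark-operator-webhook   # hypothetical name; must match the service name
  namespace: spark-operator      # referenced by the MutatingWebhookConfiguration
spec:
  ports:
  - port: 443        # the API server calls the webhook over 443
    targetPort: 8080 # assumed webhook port; verify against your operator's flags
  selector:
    # Must match the labels on the operator pods; these values are illustrative.
    app.kubernetes.io/name: spark-operator
    app.kubernetes.io/instance: my-release
```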

allenhaozi commented 1 year ago

Enable the webhook:

webhook:
  enable: true

arunalakmal commented 1 year ago

We had a similar problem with OpenShift, and enabling the webhook worked for us too. This was our values snippet for the webhook.

webhook:
  cleanupAnnotations:
    helm.sh/hook: pre-delete, pre-upgrade
    helm.sh/hook-delete-policy: hook-succeeded
  enable: true

Then the ENVs appeared.

[Screenshot 2023-02-02 at 09 14 43]

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 1 month ago

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.