
[SPARK] lang.NullPointerException: invalid null input: name #52698

Closed TimVerboisCgk closed 1 month ago

TimVerboisCgk commented 10 months ago

Name and Version

bitnami/spark:3.5.0-debian-11-r12

What architecture are you using?

amd64

What steps will reproduce the bug?

I used the bitnami helm chart to setup spark.

Then I started an example application from a client:

spark-submit \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.kubernetes.container.image=docker.io/bitnami/spark:3.5.0-debian-11-r12 \
    --master k8s://https://<cluster-url> \
    --conf spark.kubernetes.driverEnv.SPARK_MASTER_URL=spark://<spark-cluster-url>:7077 \
    --deploy-mode cluster \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --name spark-pi \
    --conf spark.kubernetes.executor.podNamePrefix=test \
    --conf spark.kubernetes.executor.request.cores=100m \
    --conf spark.kubernetes.executor.request.memory=1Gi \
    --conf spark.executor.instances=1 \
    local:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.0.jar

It starts the driver, which then starts spinning up executors:

$ kubectl get pods
NAME                               READY   STATUS    RESTARTS      AGE
test-spark-master-0              1/1     Running   0             97m
spark-pi-bdf6ef8bba1672e6-driver   1/1     Running   0             9s
timtest-exec-1                     1/1     Running   0             3s

Now, the executor crashes over and over again, being replaced by "-2", "-3", ... until it finally gives up and exits. The pods remain in the Error state.

I was able to get this log from the executor pods:

23/11/10 16:06:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.hadoop.security.KerberosAuthException: failure to login: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
    at jdk.security.auth/com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:71)
    at jdk.security.auth/com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:134)
    at java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:679)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:677)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
    at java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:677)
    at java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:2065)
    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1975)
    at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:719)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:669)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:579)
    at org.apache.spark.deploy.SparkHadoopUtil.createSparkUser(SparkHadoopUtil.scala:70)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.run(KubernetesExecutorBackend.scala:67)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.main(KubernetesExecutorBackend.scala:56)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend.main(KubernetesExecutorBackend.scala)

    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1986)
    at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:719)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:669)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:579)
    at org.apache.spark.deploy.SparkHadoopUtil.createSparkUser(SparkHadoopUtil.scala:70)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.run(KubernetesExecutorBackend.scala:67)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.main(KubernetesExecutorBackend.scala:56)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend.main(KubernetesExecutorBackend.scala)
Caused by: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
    at jdk.security.auth/com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:71)
    at jdk.security.auth/com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:134)
    at java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:679)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:677)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
    at java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:677)
    at java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:2065)
    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1975)
    at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:719)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:669)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:579)
    at org.apache.spark.deploy.SparkHadoopUtil.createSparkUser(SparkHadoopUtil.scala:70)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.run(KubernetesExecutorBackend.scala:67)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.main(KubernetesExecutorBackend.scala:56)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend.main(KubernetesExecutorBackend.scala)

    at java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:850)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:679)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:677)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
    at java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:677)
    at java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:2065)
    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1975)
    ... 8 more

What is the expected behavior?

It should not fail; the executor should just start.

What do you see instead?

Additional information

Through research on the internet I figured out that the executor script tries to look up the name of the current user. Since that user is not defined in /etc/passwd, the lookup fails and produces an empty "name" string. The java command therefore receives an empty name variable, resulting in the error:

lang.NullPointerException: invalid null input: name

A lot of users work around this by building their own image and correcting the problem in a number of different ways, such as adding a line to /etc/passwd (see the sketch below). Some even remove the "USER" directive from the Dockerfile and run it as a root container (!!).
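
For illustration only, this is a minimal sketch of the nss_wrapper-style workaround many of these custom images apply; it assumes the libnss_wrapper library shipped in Bitnami images at /opt/bitnami/common/lib/libnss_wrapper.so, and the temporary file paths and the "spark" user name are just examples:

# Run before starting the executor, when the current UID has no /etc/passwd entry
# (for example an arbitrary UID injected by Kubernetes/OpenShift).
if ! getent passwd "$(id -u)" >/dev/null; then
    # Create fake passwd/group entries for the current UID so user-name lookups succeed.
    export NSS_WRAPPER_PASSWD="$(mktemp)"
    export NSS_WRAPPER_GROUP="$(mktemp)"
    echo "spark:x:$(id -u):$(id -g):Spark:/opt/bitnami/spark:/bin/false" > "$NSS_WRAPPER_PASSWD"
    echo "spark:x:$(id -g):" > "$NSS_WRAPPER_GROUP"
    # Preload the wrapper so glibc name lookups read the fake files instead of /etc/passwd.
    export LD_PRELOAD=/opt/bitnami/common/lib/libnss_wrapper.so
fi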

I can imagine you want this fixed at the source (the Spark script itself), but until that happens, a lot of people are overriding your image to fix it themselves. Why not ship a working workaround now and drop it once the upstream fix lands?

carrodher commented 10 months ago

Thank you for bringing this issue to our attention. We appreciate your involvement! If you're interested in contributing a solution, we welcome you to create a pull request. The Bitnami team is excited to review your submission and offer feedback. You can find the contributing guidelines here.

Your contribution will greatly benefit the community. Feel free to reach out if you have any questions or need assistance.

TimVerboisCgk commented 10 months ago

Apparently I opened my issue in the wrong repository, so I created a PR in the containers repository: #52661.

Thank you for the help

carrodher commented 10 months ago

Thank you for opening this issue and submitting the associated Pull Request. Our team will review and provide feedback. Once the PR is merged, the issue will automatically close.

Your contribution is greatly appreciated!

github-actions[bot] commented 9 months ago

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] commented 9 months ago

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

rafariossaa commented 7 months ago

Hi, I would like to try to reproduce the issue. Could you indicate the parameters (or any other steps) used to deploy the chart?

TimVerboisCgk commented 7 months ago

Hi @rafariossaa,

thank you for the reply. I set it up a while back, and I'm pretty sure this is the values file I used for the helm chart:

values.TXT

rafariossaa commented 7 months ago

Thanks @TimVerboisCgk, I am continuing to work on this.

rafariossaa commented 7 months ago

Hi, I am running this:

spark-submit \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.kubernetes.container.image=docker.io/bitnami/spark:3.5.0-debian-11-r12 \
    --conf spark.kubernetes.driverEnv.SPARK_MASTER_URL=spark://spk-spark-master-svc:7077 \
    --deploy-mode cluster \
    --name spark-pi \
    --master k8s://https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spk-spark \
    --conf spark.kubernetes.executor.podNamePrefix=test \
    --conf spark.kubernetes.executor.request.cores=100m \
    --conf spark.kubernetes.executor.request.memory=1Gi \
    --conf spark.executor.instances=1 \
    --conf spark.kubernetes.driver.secrets.k8s-secret=/opt/bitnami/spark/tmp/secret-dir \
    --conf spark.kubernetes.executor.secrets.k8s-secret=/opt/bitnami/spark/tmp/secret-dir \
    --conf spark.kubernetes.authenticate.submission.caCertFile=/opt/bitnami/spark/tmp/secret-dir/k8s_pub.pem \
    local:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.0.jar

and a pod was created:

$ kubectl describe pod spark-pi-1769838d841c8e86-driver  
Name:             spark-pi-1769838d841c8e86-driver
Namespace:        default
Priority:         0
Service Account:  spk-spark
Node:             minikube/192.168.49.2
...

Containers:
  spark-kubernetes-driver:
    Container ID:  docker://c16bccd80985edcf2dad48c526633ca818876c2a9b1a9f3172912495150fc180
    Image:         docker.io/bitnami/spark:3.5.0-debian-11-r12
    Image ID:      docker-pullable://bitnami/spark@sha256:21ce8a386d1966ae560dbfaaa3d20dc030a0599db9a7ce6ab4de80226ee31cf5
    Ports:         7078/TCP, 7079/TCP, 4040/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Args:
      driver
      --properties-file
      /opt/spark/conf/spark.properties
      --class
      org.apache.spark.examples.SparkPi
      local:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.0.jar
...
    Environment:
      SPARK_USER:                 spark
      SPARK_APPLICATION_ID:       spark-16b480df358240248e9b8a956bd0484c
      SPARK_MASTER_URL:           spark://spk-spark-master-svc:7077
      SPARK_DRIVER_BIND_ADDRESS:   (v1:status.podIP)
      SPARK_LOCAL_DIRS:           /var/data/spark-cb71a698-a5d8-40b3-85ff-c6d2275488e4
      SPARK_CONF_DIR:             /opt/spark/conf
    Mounts:
      /opt/bitnami/spark/tmp/secret-dir from k8s-secret-volume (rw)
      /opt/spark/conf from spark-conf-volume-driver (rw)
      /var/data/spark-cb71a698-a5d8-40b3-85ff-c6d2275488e4 from spark-local-dir-1 (rw)
...
Volumes:
  k8s-secret-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  k8s-secret
    Optional:    false
  spark-local-dir-1:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  spark-conf-volume-driver:
    Type:        ConfigMap (a volume populated by a ConfigMap)
    Name:        spark-drv-ec67b88d841c92c6-conf-map
    Optional:    false
...

The spark-env.sh is mounted via the configmap spark-drv-ec67b88d841c92c6-conf-map, and it already contains the needed LD_PRELOAD value:

$ kubectl describe configmap spark-drv-ec67b88d841c92c6-conf-map 
Name:         spark-drv-ec67b88d841c92c6-conf-map
Namespace:    default
Labels:       <none>
Annotations:  <none>

Data
====
spark-env.sh:
----
LD_PRELOAD=/opt/bitnami/common/lib/libnss_wrapper.so

spark.kubernetes.namespace:
----
default
spark.properties:
----
#Java properties built from Kubernetes config map with name: spark-drv-ec67b88d841c92c6-conf-map
#Wed Feb 07 15:08:01 UTC 2024
spark.driver.port=7078
spark.kubernetes.driver.secrets.k8s-secret=/opt/bitnami/spark/tmp/secret-dir
spark.submit.pyFiles=
spark.kubernetes.executor.request.cores=100m
spark.kubernetes.driverEnv.SPARK_MASTER_URL=spark\://spk-spark-master-svc\:7077
spark.kubernetes.executor.secrets.k8s-secret=/opt/bitnami/spark/tmp/secret-dir
spark.kubernetes.resource.type=java
spark.app.submitTime=1707318480437
spark.kubernetes.submitInDriver=true
spark.kubernetes.executor.request.memory=1Gi
spark.kubernetes.authenticate.driver.serviceAccountName=spk-spark
spark.kubernetes.driver.pod.name=spark-pi-1769838d841c8e86-driver
spark.executor.instances=1
spark.master=k8s\://https\://10.96.0.1\:443
spark.app.name=spark-pi
spark.submit.deployMode=cluster
spark.driver.host=spark-pi-1769838d841c8e86-driver-svc.default.svc
spark.driver.blockManager.port=7079
spark.app.id=spark-16b480df358240248e9b8a956bd0484c
spark.kubernetes.container.image=docker.io/bitnami/spark\:3.5.0-debian-11-r12
spark.kubernetes.executor.podNamePrefix=test
spark.kubernetes.memoryOverheadFactor=0.1
spark.kubernetes.authenticate.submission.caCertFile=/opt/bitnami/spark/tmp/secret-dir/k8s_pub.pem
spark.jars=local\:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.0.jar
spark.driver.extraJavaOptions=--add-exports java.base/sun.nio.ch\=ALL-UNNAMED

BinaryData
====

In my case, the pod is running with the default user ID (1001). In the values.txt you sent, the user in the pod and container security contexts is also set to 1001 (the default value). I am not sure how you are getting the issue when the spark-env.sh file is created; does the cluster where you are running Spark enforce user IDs somehow?
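
For reference, one way to check this from outside the pod would be something like the following; the pod name <executor-pod> and namespace <namespace> are placeholders for whatever your crashing executor is actually called:

# Does the UID the container runs as have a passwd entry?
kubectl exec -n <namespace> <executor-pod> -- sh -c 'id -u; getent passwd "$(id -u)" || echo "UID $(id -u) has no /etc/passwd entry"'
# Did LD_PRELOAD (libnss_wrapper) actually reach the executor process environment?
kubectl exec -n <namespace> <executor-pod> -- sh -c 'echo "LD_PRELOAD=$LD_PRELOAD"'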

TimVerboisCgk commented 7 months ago

Rafael,

Did the exec pods start? Because in my case they kept crashing until they stopped (10 times, I believe), and then they disappear.

lgarg-kimbal commented 2 months ago

I'm facing the same issue.

Docker Compose file:

services:
  spark:
    build: ./Docker/spark/
    environment:
      - SPARK_MODE=master
      - SPARK_RPC_AUTHENTICATION_ENABLED=no
      - SPARK_RPC_ENCRYPTION_ENABLED=no
      - SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
      - SPARK_SSL_ENABLED=no
      - SPARK_USER=spark
    ports:
      - '8080:8080'
      - '7078:7077'
      # - "8081:8081"
    volumes:
      - ./src:/home/src
  spark-worker:
    build: ./Docker/spark/
    environment:
      - SPARK_MODE=worker
      - SPARK_MASTER_URL=spark://spark:7077
      - SPARK_WORKER_MEMORY=1G
      - SPARK_WORKER_CORES=1
      - SPARK_RPC_AUTHENTICATION_ENABLED=no
      - SPARK_RPC_ENCRYPTION_ENABLED=no
      - SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
      - SPARK_SSL_ENABLED=no
      - SPARK_USER=spark
    volumes:
      - ./src:/home/src

  postgres:
    image: postgres:latest
    container_name: postgres
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: 'bitnami/redis:latest'
    environment:
      - ALLOW_EMPTY_PASSWORD=yes
    volumes:
      - redis_data:/bitnami
  airflow-scheduler:
    build: ./Docker/airflow_scheduler/
    environment:
      - AIRFLOW_FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
      - AIRFLOW_SECRET_KEY=a25mQ1FHTUh3MnFRSk5KMEIyVVU2YmN0VGRyYTVXY08=
      - AIRFLOW_WEBSERVER_HOST=airflow
      - AIRFLOW_EXECUTOR=LocalExecutor
      - AIRFLOW_DATABASE_HOST=postgres
      - AIRFLOW_DATABASE_NAME=airflow
      - AIRFLOW_DATABASE_USERNAME=airflow
      - AIRFLOW_DATABASE_PASSWORD=airflow
      - AIRFLOW_LOAD_EXAMPLES=no
    volumes:
      - ./src/dags:/opt/bitnami/airflow/dags
      - ./src/jobs:/opt/bitnami/airflow/jobs
      - ./src/configs:/opt/bitnami/airflow/configs
      - ./src/dependencies:/opt/bitnami/airflow/dependencies
      - ./airflow_scheduler_requirements.txt:/bitnami/python/requirements.txt
  airflow:
    image: bitnami/airflow:latest
    environment:
      - AIRFLOW_FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
      - AIRFLOW_SECRET_KEY=a25mQ1FHTUh3MnFRSk5KMEIyVVU2YmN0VGRyYTVXY08=
      - AIRFLOW_EXECUTOR=LocalExecutor
      - AIRFLOW_DATABASE_HOST=postgres
      - AIRFLOW_DATABASE_NAME=airflow
      - AIRFLOW_DATABASE_USERNAME=airflow
      - AIRFLOW_DATABASE_PASSWORD=airflow
      - AIRFLOW_PASSWORD=bitnami123
      - AIRFLOW_USERNAME=user
      - AIRFLOW_EMAIL=user@example.com
      - AIRFLOW_LOAD_EXAMPLES=no
    ports:
      - "8081:8080"
    volumes:
      - ./src/dags:/opt/bitnami/airflow/dags
      - ./src/jobs:/opt/bitnami/airflow/jobs
      - ./src/configs:/opt/bitnami/airflow/configs
      - ./src/dependencies:/opt/bitnami/airflow/dependencies
      - ./airflow_scheduler_requirements.txt:/bitnami/python/requirements.txt
    depends_on:
      - postgres
      - spark
      - spark-worker

volumes:
  postgres_data:
  airflow:
  spark:
  redis_data: 

Spark Dockerfile:

FROM bitnami/spark:3.5.1
USER root
RUN install_packages curl
USER 1001
RUN curl https://repo1.maven.org/maven2/com/microsoft/sqlserver/mssql-jdbc/12.6.3.jre11/mssql-jdbc-12.6.3.jre11.jar --output /opt/bitnami/spark/jars/mssql-jdbc-12.6.3.jre11.jar

RUN curl https://repo1.maven.org/maven2/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar --output /opt/bitnami/spark/jars/postgresql-42.7.3.jar

Airflow scheduler Dockerfile:

FROM bitnami/airflow-scheduler:latest
USER root

RUN apt-get update && \
    apt-get install -y wget gnupg && \
    rm -rf /var/lib/apt/lists/*

RUN wget -O /tmp/openjdk-21.tar.gz https://download.java.net/java/GA/jdk21.0.2/f2283984656d49d69e91c558476027ac/13/GPL/openjdk-21.0.2_linux-x64_bin.tar.gz && \
    mkdir -p /usr/lib/jvm && \
    tar -xzf /tmp/openjdk-21.tar.gz -C /usr/lib/jvm && \
    mv /usr/lib/jvm/jdk-21.0.2 /usr/lib/jvm/java-21-openjdk-amd64 && \
    rm /tmp/openjdk-21.tar.gz

USER 1001

ENV JAVA_HOME=/usr/lib/jvm/java-21-openjdk-amd64
ENV PATH=$JAVA_HOME/bin:$PATH

ENV SPARK_HOME=/opt/bitnami/airflow/venv/lib/python3.11/site-packages/pyspark
ENV PATH=$SPARK_HOME/bin:$PATH

rafariossaa commented 2 months ago

Hi @lgarg-kimbal, your scenario is different, as you are using Compose and custom-built images. Do you mind opening a new issue with your case? This way we can track the issues properly. Feel free to reference this one.

rafariossaa commented 2 months ago

@TimVerboisCgk Sorry for the delay, but please use the website instead of replying by mail.

Did the exec pods start? Because in my case they kept on crashing until they stopped (10 times I believe). And they will disappear.

Maybe in the meantime you were able to solve the issue; in any case, what are you getting in the logs or in the pod state?

TimVerboisCgk commented 2 months ago

Hello @rafariossaa

So, I retested it and I will try to provide all the info you need to reproduce the problem, because it is still there in the latest version.

This is the spark-submit command I use:

/opt/bitnami/spark/bin/spark-submit \
        --master k8s://https://kubernetes.default:443 \
        --deploy-mode cluster \
        --conf spark.kubernetes.driverEnv.SPARK_MASTER_URL=spark://cegeka-spark-master-0.cegeka-spark-headless.spark.svc.cluster.local:7077 \
        --conf spark.kubernetes.driver.label.sidecar.istio.io/inject=false \
        --conf "spark.kubernetes.driver.service.annotation.prometheus.io/path=/metrics/executors/prometheus" \
        --conf "spark.kubernetes.driver.service.annotation.prometheus.io/port=4040" \
        --conf "spark.kubernetes.driver.service.annotation.prometheus.io/scrape=true" \
        --conf spark.kubernetes.executor.label.sidecar.istio.io/inject=false \
        --conf spark.kubernetes.container.image=bitnami/spark:3.5.1-debian-12-r7 \
        --conf spark.ui.prometheus.enabled=true \
        --conf spark.jars.ivy=/tmp/.ivy \
        --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
        --class org.apache.spark.examples.SparkPi \
        --conf spark.kubernetes.namespace=spark-apps \
        --name spark-pi \
        --conf spark.kubernetes.executor.podNamePrefix=timtest \
        --conf spark.kubernetes.driver.request.cores=100m \
        --conf spark.kubernetes.driver.request.memory=1Gi \
        --conf spark.kubernetes.executor.request.cores=100m \
        --conf spark.kubernetes.executor.request.memory=1Gi \
         local:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.1.jar

So, if I run this job, the driver pod is started:

# kubectl get pods -n spark-apps
NAME                               READY   STATUS              RESTARTS   AGE
spark-pi-852879905be2067b-driver   0/1     ContainerCreating   0          1s
# kubectl get pods -n spark-apps
NAME                               READY   STATUS    RESTARTS   AGE
spark-pi-852879905be2067b-driver   1/1     Running   0          4s
# kubectl get pods -n spark-apps
NAME                               READY   STATUS    RESTARTS   AGE
spark-pi-852879905be2067b-driver   1/1     Running   0          9s
# kubectl get pods -n spark-apps
NAME                               READY   STATUS              RESTARTS   AGE
spark-pi-852879905be2067b-driver   1/1     Running             0          10s
timtest-exec-1                     0/1     ContainerCreating   0          1s
timtest-exec-2                     0/1     ContainerCreating   0          1s
# kubectl get pods -n spark-apps
NAME                               READY   STATUS    RESTARTS   AGE
spark-pi-852879905be2067b-driver   1/1     Running   0          11s
timtest-exec-1                     1/1     Running   0          2s
timtest-exec-2                     1/1     Running   0          2s

At this point, I have my command ready to capture the logs:

# kubectl logs timtest-exec-4 -n spark-apps
spark 22:47:54.62 INFO  ==>
spark 22:47:54.62 INFO  ==> Welcome to the Bitnami spark container
spark 22:47:54.62 INFO  ==> Subscribe to project updates by watching https://github.com/bitnami/containers
spark 22:47:54.62 INFO  ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
spark 22:47:54.63 INFO  ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
spark 22:47:54.63 INFO  ==>

Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
24/06/27 22:47:57 INFO KubernetesExecutorBackend: Started daemon with process name: 1@timtest-exec-4
24/06/27 22:47:57 INFO SignalUtils: Registering signal handler for TERM
24/06/27 22:47:57 INFO SignalUtils: Registering signal handler for HUP
24/06/27 22:47:57 INFO SignalUtils: Registering signal handler for INT
24/06/27 22:47:58 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.hadoop.security.KerberosAuthException: failure to login: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
    at jdk.security.auth/com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:71)
    at jdk.security.auth/com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:134)
    at java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:679)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:677)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
    at java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:677)
    at java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:2065)
    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1975)
    at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:719)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:669)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:579)
    at org.apache.spark.deploy.SparkHadoopUtil.createSparkUser(SparkHadoopUtil.scala:70)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.run(KubernetesExecutorBackend.scala:67)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.main(KubernetesExecutorBackend.scala:56)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend.main(KubernetesExecutorBackend.scala)

    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1986)
    at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:719)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:669)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:579)
    at org.apache.spark.deploy.SparkHadoopUtil.createSparkUser(SparkHadoopUtil.scala:70)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.run(KubernetesExecutorBackend.scala:67)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.main(KubernetesExecutorBackend.scala:56)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend.main(KubernetesExecutorBackend.scala)
Caused by: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
    at jdk.security.auth/com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:71)
    at jdk.security.auth/com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:134)
    at java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:679)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:677)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
    at java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:677)
    at java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:2065)
    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1975)
    at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:719)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:669)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:579)
    at org.apache.spark.deploy.SparkHadoopUtil.createSparkUser(SparkHadoopUtil.scala:70)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.run(KubernetesExecutorBackend.scala:67)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.main(KubernetesExecutorBackend.scala:56)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend.main(KubernetesExecutorBackend.scala)

    at java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:850)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:679)
    at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:677)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
    at java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:677)
    at java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:2065)
    at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1975)
    ... 8 more

The error is thus the same as the one in the title of this issue:

javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name

I explained what the underlying problem is and proposed a patch, which we are actively using in production. But every time we adopt a new version I have to port the patch to it, so I'd rather have you fix it, even if your fix is completely different from my patch.

TimVerboisCgk commented 2 months ago

So, I added my fix to the latest version of the container and tried the above again, this time with my own image, and this is the result:

# kubectl get pods -n spark-apps
NAME                               READY   STATUS              RESTARTS   AGE
spark-pi-d9deaa905dea2f4f-driver   0/1     ContainerCreating   0          1s
# kubectl get pods -n spark-apps
NAME                               READY   STATUS    RESTARTS   AGE
spark-pi-d9deaa905dea2f4f-driver   1/1     Running   0          4s
# kubectl get pods -n spark-apps
NAME                               READY   STATUS              RESTARTS   AGE
spark-pi-d9deaa905dea2f4f-driver   1/1     Running             0          11s
timtest-exec-1                     0/1     ContainerCreating   0          1s
timtest-exec-2                     0/1     ContainerCreating   0          1s
# kubectl get pods -n spark-apps
NAME                               READY   STATUS    RESTARTS   AGE
spark-pi-d9deaa905dea2f4f-driver   1/1     Running   0          13s
timtest-exec-1                     1/1     Running   0          3s
timtest-exec-2                     1/1     Running   0          3s
# kubectl get pods -n spark-apps
NAME                               READY   STATUS        RESTARTS   AGE
spark-pi-d9deaa905dea2f4f-driver   1/1     Running       0          16s
timtest-exec-1                     1/1     Terminating   0          6s
timtest-exec-2                     1/1     Running       0          6s
# kubectl get pods -n spark-apps
NAME                               READY   STATUS      RESTARTS   AGE
spark-pi-d9deaa905dea2f4f-driver   0/1     Completed   0          18s

If I look at the logs of the executor pods, I can see:

spark 08:21:48.83 INFO  ==>
spark 08:21:48.83 INFO  ==> Welcome to the Bitnami spark container
spark 08:21:48.84 INFO  ==> Subscribe to project updates by watching https://github.com/bitnami/containers
spark 08:21:48.84 INFO  ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
spark 08:21:48.84 INFO  ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
spark 08:21:48.84 INFO  ==>

Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
24/06/28 08:21:51 INFO KubernetesExecutorBackend: Started daemon with process name: 1@timtest-exec-1
24/06/28 08:21:51 INFO SignalUtils: Registering signal handler for TERM
24/06/28 08:21:51 INFO SignalUtils: Registering signal handler for HUP
24/06/28 08:21:51 INFO SignalUtils: Registering signal handler for INT
24/06/28 08:21:52 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/06/28 08:21:52 INFO SecurityManager: Changing view acls to: spark
24/06/28 08:21:52 INFO SecurityManager: Changing modify acls to: spark
24/06/28 08:21:52 INFO SecurityManager: Changing view acls groups to:
24/06/28 08:21:52 INFO SecurityManager: Changing modify acls groups to:
24/06/28 08:21:52 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: spark; groups with view permissions: EMPTY; users with modify permissions: spark; groups with modify permissions: EMPTY
24/06/28 08:21:53 INFO TransportClientFactory: Successfully created connection to spark-pi-4c19f3905def8cba-driver-svc.spark-apps.svc/10.244.36.70:7078 after 134 ms (0 ms spent in bootstraps)
24/06/28 08:21:53 INFO SecurityManager: Changing view acls to: spark
24/06/28 08:21:53 INFO SecurityManager: Changing modify acls to: spark
24/06/28 08:21:53 INFO SecurityManager: Changing view acls groups to:
24/06/28 08:21:53 INFO SecurityManager: Changing modify acls groups to:
24/06/28 08:21:53 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: spark; groups with view permissions: EMPTY; users with modify permissions: spark; groups with modify permissions: EMPTY
24/06/28 08:21:53 INFO TransportClientFactory: Successfully created connection to spark-pi-4c19f3905def8cba-driver-svc.spark-apps.svc/10.244.36.70:7078 after 3 ms (0 ms spent in bootstraps)
24/06/28 08:21:53 INFO DiskBlockManager: Created local directory at /var/data/spark-c83a972e-7fb9-41a1-aedf-011abedef1ef/blockmgr-fddb1182-8537-4c1d-a510-242c49c666b4
24/06/28 08:21:53 INFO MemoryStore: MemoryStore started with capacity 413.9 MiB
24/06/28 08:21:54 INFO CoarseGrainedExecutorBackend: Connecting to driver: spark://CoarseGrainedScheduler@spark-pi-4c19f3905def8cba-driver-svc.spark-apps.svc:7078
24/06/28 08:21:54 INFO ResourceUtils: ==============================================================
24/06/28 08:21:54 INFO ResourceUtils: No custom resources configured for spark.executor.
24/06/28 08:21:54 INFO ResourceUtils: ==============================================================
24/06/28 08:21:54 INFO CoarseGrainedExecutorBackend: Successfully registered with driver
24/06/28 08:21:54 INFO Executor: Starting executor ID 1 on host 10.244.98.121
24/06/28 08:21:54 INFO Executor: OS info Linux, 5.15.0-107-generic, amd64
24/06/28 08:21:54 INFO Executor: Java version 17.0.11
24/06/28 08:21:54 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39283.
24/06/28 08:21:54 INFO NettyBlockTransferService: Server created on 10.244.98.121:39283
24/06/28 08:21:54 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
24/06/28 08:21:54 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(1, 10.244.98.121, 39283, None)
24/06/28 08:21:54 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(1, 10.244.98.121, 39283, None)
24/06/28 08:21:54 INFO BlockManager: Initialized BlockManager: BlockManagerId(1, 10.244.98.121, 39283, None)
24/06/28 08:21:54 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): ''
24/06/28 08:21:54 INFO Executor: Created or updated repl class loader org.apache.spark.util.MutableURLClassLoader@6fb151d6 for default.
24/06/28 08:21:54 INFO Executor: Fetching file:/opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.1.jar with timestamp 1719562904826
24/06/28 08:21:54 INFO Utils: Copying /opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.1.jar to /var/data/spark-c83a972e-7fb9-41a1-aedf-011abedef1ef/spark-1bd7d629-411e-4bf5-945b-ee142934d83b/66015011719562904826_cache
24/06/28 08:21:54 INFO Utils: Copying /var/data/spark-c83a972e-7fb9-41a1-aedf-011abedef1ef/spark-1bd7d629-411e-4bf5-945b-ee142934d83b/66015011719562904826_cache to /opt/bitnami/spark/./spark-examples_2.12-3.5.1.jar
24/06/28 08:21:54 INFO Executor: Adding file:/opt/bitnami/spark/./spark-examples_2.12-3.5.1.jar to class loader default
24/06/28 08:21:55 INFO CoarseGrainedExecutorBackend: Got assigned task 1
24/06/28 08:21:55 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
24/06/28 08:21:56 INFO TorrentBroadcast: Started reading broadcast variable 0 with 1 pieces (estimated total size 4.0 MiB)
24/06/28 08:21:56 INFO TransportClientFactory: Successfully created connection to spark-pi-4c19f3905def8cba-driver-svc.spark-apps.svc/10.244.36.70:7079 after 13 ms (0 ms spent in bootstraps)
24/06/28 08:21:56 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.3 KiB, free 413.9 MiB)
24/06/28 08:21:56 INFO TorrentBroadcast: Reading broadcast variable 0 took 287 ms
24/06/28 08:21:56 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 4.0 KiB, free 413.9 MiB)
24/06/28 08:21:56 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 1055 bytes result sent to driver
24/06/28 08:21:58 INFO CoarseGrainedExecutorBackend: Driver commanded a shutdown
24/06/28 08:21:58 INFO MemoryStore: MemoryStore cleared
24/06/28 08:21:58 INFO BlockManager: BlockManager stopped
24/06/28 08:21:58 INFO ShutdownHookManager: Shutdown hook called
24/06/28 08:21:58 INFO ShutdownHookManager: Deleting directory /var/data/spark-c83a972e-7fb9-41a1-aedf-011abedef1ef/spark-1bd7d629-411e-4bf5-945b-ee142934d83b

TimVerboisCgk commented 2 months ago

@lgarg-kimbal

I've made my fixed image publicly available for you to try. If it works with this image, it is the same bug that is bothering you, even though you are using Docker Compose:

verboistim/spark:3.5.1-debian-12-r7-bugfix

TimVerboisCgk commented 2 months ago

Rafael,

I already did last year:

https://github.com/bitnami/containers/pull/52661

But it was rejected by Michiel (mdhont) because it did not follow the strategy you use, and that is fine by me. Just please fix it another way then.

(In reply to rafariossaa's question: "Hi, would you like to contribute and send a PR with your patch? We will be glad to review and merge it.")

github-actions[bot] commented 2 months ago

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] commented 1 month ago

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

salehjafarli commented 1 month ago

Hi, I am facing this issue as well. Any updates on this?

KhurtinDN commented 1 month ago

I found a problem with

LD_PRELOAD=$LIBNSS_WRAPPER_PATH

https://github.com/bitnami/containers/blob/f42116e177933a61ba4beaf3c80233c01d569746/bitnami/spark/3.5/debian-12/rootfs/opt/bitnami/scripts/spark/entrypoint.sh#L30

This part was badly written.

I use k8s mode, so my workaround is:

--conf spark.executorEnv.LD_PRELOAD=/opt/bitnami/common/lib/libnss_wrapper.so

Could you refactor this part? Right now Spark does not pick up /opt/bitnami/spark/conf/spark-env.sh, so the 'LD_PRELOAD' setting never reaches the executors.
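
A minimal sketch of that workaround applied to a full submission, reusing the SparkPi example and image tag from earlier in this thread; it sets LD_PRELOAD for both driver and executor, assuming the libnss_wrapper path shipped in the Bitnami image, and whether LD_PRELOAD alone is enough may depend on how the image's entrypoint prepares the nss_wrapper passwd files:

spark-submit \
    --master k8s://https://kubernetes.default:443 \
    --deploy-mode cluster \
    --name spark-pi \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.kubernetes.container.image=bitnami/spark:3.5.1-debian-12-r7 \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf spark.kubernetes.driverEnv.LD_PRELOAD=/opt/bitnami/common/lib/libnss_wrapper.so \
    --conf spark.executorEnv.LD_PRELOAD=/opt/bitnami/common/lib/libnss_wrapper.so \
    local:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.1.jar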