Closed: TimVerboisCgk closed this issue 4 months ago
Thank you for bringing this issue to our attention. We appreciate your involvement! If you're interested in contributing a solution, we welcome you to create a pull request. The Bitnami team is excited to review your submission and offer feedback. You can find the contributing guidelines here.
Your contribution will greatly benefit the community. Feel free to reach out if you have any questions or need assistance.
Apparently, I made my issue in the wrong repository, so I created a PR in the Container repository: #52661.
Thank you for the help
Thank you for opening this issue and submitting the associated Pull Request. Our team will review and provide feedback. Once the PR is merged, the issue will automatically close.
Your contribution is greatly appreciated!
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.
Hi, I would like to try to reproduce the issue. Could you indicate the parameters (or any other steps) used to deploy the chart?
Hi @rafariossaa,
thank you for the reply. I set it up a while back, and I'm pretty sure this is the values file I used for the Helm chart:
Thanks @TimVerboisCgk, I am continuing to work on this.
Hi, I am running this:
spark-submit \
--class org.apache.spark.examples.SparkPi \
--conf spark.kubernetes.container.image=docker.io/bitnami/spark:3.5.0-debian-11-r12 \
--conf spark.kubernetes.driverEnv.SPARK_MASTER_URL=spark://spk-spark-master-svc:7077 \
--deploy-mode cluster \
--name spark-pi \
--master k8s://https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spk-spark \
--conf spark.kubernetes.executor.podNamePrefix=test \
--conf spark.kubernetes.executor.request.cores=100m \
--conf spark.kubernetes.executor.request.memory=1Gi \
--conf spark.executor.instances=1 \
--conf spark.kubernetes.driver.secrets.k8s-secret=/opt/bitnami/spark/tmp/secret-dir \
--conf spark.kubernetes.executor.secrets.k8s-secret=/opt/bitnami/spark/tmp/secret-dir \
--conf spark.kubernetes.authenticate.submission.caCertFile=/opt/bitnami/spark/tmp/secret-dir/k8s_pub.pem \
local:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.0.jar
and a pod was created:
$ kubectl describe pod spark-pi-1769838d841c8e86-driver
Name: spark-pi-1769838d841c8e86-driver
Namespace: default
Priority: 0
Service Account: spk-spark
Node: minikube/192.168.49.2
...
Containers:
spark-kubernetes-driver:
Container ID: docker://c16bccd80985edcf2dad48c526633ca818876c2a9b1a9f3172912495150fc180
Image: docker.io/bitnami/spark:3.5.0-debian-11-r12
Image ID: docker-pullable://bitnami/spark@sha256:21ce8a386d1966ae560dbfaaa3d20dc030a0599db9a7ce6ab4de80226ee31cf5
Ports: 7078/TCP, 7079/TCP, 4040/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
Args:
driver
--properties-file
/opt/spark/conf/spark.properties
--class
org.apache.spark.examples.SparkPi
local:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.0.jar
...
Environment:
SPARK_USER: spark
SPARK_APPLICATION_ID: spark-16b480df358240248e9b8a956bd0484c
SPARK_MASTER_URL: spark://spk-spark-master-svc:7077
SPARK_DRIVER_BIND_ADDRESS: (v1:status.podIP)
SPARK_LOCAL_DIRS: /var/data/spark-cb71a698-a5d8-40b3-85ff-c6d2275488e4
SPARK_CONF_DIR: /opt/spark/conf
Mounts:
/opt/bitnami/spark/tmp/secret-dir from k8s-secret-volume (rw)
/opt/spark/conf from spark-conf-volume-driver (rw)
/var/data/spark-cb71a698-a5d8-40b3-85ff-c6d2275488e4 from spark-local-dir-1 (rw)
...
Volumes:
k8s-secret-volume:
Type: Secret (a volume populated by a Secret)
SecretName: k8s-secret
Optional: false
spark-local-dir-1:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
spark-conf-volume-driver:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: spark-drv-ec67b88d841c92c6-conf-map
Optional: false
...
The spark-env.sh is mounted via the configmap spark-drv-ec67b88d841c92c6-conf-map, and this already contains the needed LD_PRELOAD value:
$ kubectl describe configmap spark-drv-ec67b88d841c92c6-conf-map
Name: spark-drv-ec67b88d841c92c6-conf-map
Namespace: default
Labels: <none>
Annotations: <none>
Data
====
spark-env.sh:
----
LD_PRELOAD=/opt/bitnami/common/lib/libnss_wrapper.so
spark.kubernetes.namespace:
----
default
spark.properties:
----
#Java properties built from Kubernetes config map with name: spark-drv-ec67b88d841c92c6-conf-map
#Wed Feb 07 15:08:01 UTC 2024
spark.driver.port=7078
spark.kubernetes.driver.secrets.k8s-secret=/opt/bitnami/spark/tmp/secret-dir
spark.submit.pyFiles=
spark.kubernetes.executor.request.cores=100m
spark.kubernetes.driverEnv.SPARK_MASTER_URL=spark\://spk-spark-master-svc\:7077
spark.kubernetes.executor.secrets.k8s-secret=/opt/bitnami/spark/tmp/secret-dir
spark.kubernetes.resource.type=java
spark.app.submitTime=1707318480437
spark.kubernetes.submitInDriver=true
spark.kubernetes.executor.request.memory=1Gi
spark.kubernetes.authenticate.driver.serviceAccountName=spk-spark
spark.kubernetes.driver.pod.name=spark-pi-1769838d841c8e86-driver
spark.executor.instances=1
spark.master=k8s\://https\://10.96.0.1\:443
spark.app.name=spark-pi
spark.submit.deployMode=cluster
spark.driver.host=spark-pi-1769838d841c8e86-driver-svc.default.svc
spark.driver.blockManager.port=7079
spark.app.id=spark-16b480df358240248e9b8a956bd0484c
spark.kubernetes.container.image=docker.io/bitnami/spark\:3.5.0-debian-11-r12
spark.kubernetes.executor.podNamePrefix=test
spark.kubernetes.memoryOverheadFactor=0.1
spark.kubernetes.authenticate.submission.caCertFile=/opt/bitnami/spark/tmp/secret-dir/k8s_pub.pem
spark.jars=local\:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.0.jar
spark.driver.extraJavaOptions=--add-exports java.base/sun.nio.ch\=ALL-UNNAMED
BinaryData
====
In my case, the pod is using the default user ID (1001) to run. In the values.txt you sent, the user in the pod and container security context is also set to 1001 (the default value). I am not sure how you are getting the issue when the spark-env.sh file is created. Does the cluster where you are running Spark enforce user IDs somehow?
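A quick way to double-check which user the driver pod effectively runs with (a sketch that reuses the pod name from the output above; plain kubectl, nothing chart-specific):
# Pod-level security context (runAsUser may also be set per container)
kubectl get pod spark-pi-1769838d841c8e86-driver -o jsonpath='{.spec.securityContext}{"\n"}'
# Effective UID/GID inside the running container
kubectl exec spark-pi-1769838d841c8e86-driver -- id
If the second command reports a UID that has no entry in /etc/passwd, that is the situation the executors later trip over.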
Rafael,
Did the exec pods start? Because in my case they kept crashing until they stopped (10 times, I believe), and then they disappear.
I'm facing the same issue.
Docker Compose file:
services:
  spark:
    build: ./Docker/spark/
    environment:
      - SPARK_MODE=master
      - SPARK_RPC_AUTHENTICATION_ENABLED=no
      - SPARK_RPC_ENCRYPTION_ENABLED=no
      - SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
      - SPARK_SSL_ENABLED=no
      - SPARK_USER=spark
    ports:
      - '8080:8080'
      - '7078:7077'
      # - "8081:8081"
    volumes:
      - ./src:/home/src
  spark-worker:
    build: ./Docker/spark/
    environment:
      - SPARK_MODE=worker
      - SPARK_MASTER_URL=spark://spark:7077
      - SPARK_WORKER_MEMORY=1G
      - SPARK_WORKER_CORES=1
      - SPARK_RPC_AUTHENTICATION_ENABLED=no
      - SPARK_RPC_ENCRYPTION_ENABLED=no
      - SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
      - SPARK_SSL_ENABLED=no
      - SPARK_USER=spark
    volumes:
      - ./src:/home/src
  postgres:
    image: postgres:latest
    container_name: postgres
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
  redis:
    image: 'bitnami/redis:latest'
    environment:
      - ALLOW_EMPTY_PASSWORD=yes
    volumes:
      - redis_data:/bitnami
  airflow-scheduler:
    build: ./Docker/airflow_scheduler/
    environment:
      - AIRFLOW_FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
      - AIRFLOW_SECRET_KEY=a25mQ1FHTUh3MnFRSk5KMEIyVVU2YmN0VGRyYTVXY08=
      - AIRFLOW_WEBSERVER_HOST=airflow
      - AIRFLOW_EXECUTOR=LocalExecutor
      - AIRFLOW_DATABASE_HOST=postgres
      - AIRFLOW_DATABASE_NAME=airflow
      - AIRFLOW_DATABASE_USERNAME=airflow
      - AIRFLOW_DATABASE_PASSWORD=airflow
      - AIRFLOW_LOAD_EXAMPLES=no
    volumes:
      - ./src/dags:/opt/bitnami/airflow/dags
      - ./src/jobs:/opt/bitnami/airflow/jobs
      - ./src/configs:/opt/bitnami/airflow/configs
      - ./src/dependencies:/opt/bitnami/airflow/dependencies
      - ./airflow_scheduler_requirements.txt:/bitnami/python/requirements.txt
  airflow:
    image: bitnami/airflow:latest
    environment:
      - AIRFLOW_FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
      - AIRFLOW_SECRET_KEY=a25mQ1FHTUh3MnFRSk5KMEIyVVU2YmN0VGRyYTVXY08=
      - AIRFLOW_EXECUTOR=LocalExecutor
      - AIRFLOW_DATABASE_HOST=postgres
      - AIRFLOW_DATABASE_NAME=airflow
      - AIRFLOW_DATABASE_USERNAME=airflow
      - AIRFLOW_DATABASE_PASSWORD=airflow
      - AIRFLOW_PASSWORD=bitnami123
      - AIRFLOW_USERNAME=user
      - AIRFLOW_EMAIL=user@example.com
      - AIRFLOW_LOAD_EXAMPLES=no
    ports:
      - "8081:8080"
    volumes:
      - ./src/dags:/opt/bitnami/airflow/dags
      - ./src/jobs:/opt/bitnami/airflow/jobs
      - ./src/configs:/opt/bitnami/airflow/configs
      - ./src/dependencies:/opt/bitnami/airflow/dependencies
      - ./airflow_scheduler_requirements.txt:/bitnami/python/requirements.txt
    depends_on:
      - postgres
      - spark
      - spark-worker
volumes:
  postgres_data:
  airflow:
  spark:
  redis_data:
Spark Dockerfile:
FROM bitnami/spark:3.5.1
USER root
RUN install_packages curl
USER 1001
RUN curl https://repo1.maven.org/maven2/com/microsoft/sqlserver/mssql-jdbc/12.6.3.jre11/mssql-jdbc-12.6.3.jre11.jar --output /opt/bitnami/spark/jars/mssql-jdbc-12.6.3.jre11.jar
RUN curl https://repo1.maven.org/maven2/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar --output /opt/bitnami/spark/jars/postgresql-42.7.3.jar
Airflow scheduler Dockerfile:
FROM bitnami/airflow-scheduler:latest
USER root
RUN apt-get update && \
apt-get install -y wget gnupg && \
rm -rf /var/lib/apt/lists/*
RUN wget -O /tmp/openjdk-21.tar.gz https://download.java.net/java/GA/jdk21.0.2/f2283984656d49d69e91c558476027ac/13/GPL/openjdk-21.0.2_linux-x64_bin.tar.gz
RUN mkdir -p /usr/lib/jvm
RUN tar -xzf /tmp/openjdk-21.tar.gz -C /usr/lib/jvm
RUN mv /usr/lib/jvm/jdk-21.0.2 /usr/lib/jvm/java-21-openjdk-amd64
RUN rm /tmp/openjdk-21.tar.gz
USER 1001
ENV JAVA_HOME=/usr/lib/jvm/java-21-openjdk-amd64
ENV PATH=$JAVA_HOME/bin:$PATH
ENV SPARK_HOME=/opt/bitnami/airflow/venv/lib/python3.11/site-packages/pyspark
ENV PATH=$SPARK_HOME/bin:$PATH
Hi @lgarg-kimbal, your scenario is different, as you are using Compose and custom-built images. Do you mind opening a new issue for your case? That way we can track the issues properly. Feel free to reference this one.
@TimVerboisCgk Sorry for the delay, but please use the website instead of replying by mail.
Did the exec pods start? Because in my case they kept crashing until they stopped (10 times, I believe), and then they disappear.
Maybe in this interval you were able to solve the issue. In any case, what are you getting in the logs or in the pod state?
Hello @rafariossaa
So, I retested it and I will try to provide all the info you need to reproduce the problem, because it is still there in the latest version.
This is the spark-submit command I use:
/opt/bitnami/spark/bin/spark-submit \
--master k8s://https://kubernetes.default:443 \
--deploy-mode cluster \
--conf spark.kubernetes.driverEnv.SPARK_MASTER_URL=spark://cegeka-spark-master-0.cegeka-spark-headless.spark.svc.cluster.local:7077 \
--conf spark.kubernetes.driver.label.sidecar.istio.io/inject=false \
--conf "spark.kubernetes.driver.service.annotation.prometheus.io/path=/metrics/executors/prometheus" \
--conf "spark.kubernetes.driver.service.annotation.prometheus.io/port=4040" \
--conf "spark.kubernetes.driver.service.annotation.prometheus.io/scrape=true" \
--conf spark.kubernetes.executor.label.sidecar.istio.io/inject=false \
--conf spark.kubernetes.container.image=bitnami/spark:3.5.1-debian-12-r7 \
--conf spark.ui.prometheus.enabled=true \
--conf spark.jars.ivy=/tmp/.ivy \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--class org.apache.spark.examples.SparkPi \
--conf spark.kubernetes.namespace=spark-apps \
--name spark-pi \
--conf spark.kubernetes.executor.podNamePrefix=timtest \
--conf spark.kubernetes.driver.request.cores=100m \
--conf spark.kubernetes.driver.request.memory=1Gi \
--conf spark.kubernetes.executor.request.cores=100m \
--conf spark.kubernetes.executor.request.memory=1Gi \
local:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.1.jar
So, if I run this job, the driver pod is started:
# kubectl get pods -n spark-apps
NAME READY STATUS RESTARTS AGE
spark-pi-852879905be2067b-driver 0/1 ContainerCreating 0 1s
# kubectl get pods -n spark-apps
NAME READY STATUS RESTARTS AGE
spark-pi-852879905be2067b-driver 1/1 Running 0 4s
# kubectl get pods -n spark-apps
NAME READY STATUS RESTARTS AGE
spark-pi-852879905be2067b-driver 1/1 Running 0 9s
# kubectl get pods -n spark-apps
NAME READY STATUS RESTARTS AGE
spark-pi-852879905be2067b-driver 1/1 Running 0 10s
timtest-exec-1 0/1 ContainerCreating 0 1s
timtest-exec-2 0/1 ContainerCreating 0 1s
# kubectl get pods -n spark-apps
NAME READY STATUS RESTARTS AGE
spark-pi-852879905be2067b-driver 1/1 Running 0 11s
timtest-exec-1 1/1 Running 0 2s
timtest-exec-2 1/1 Running 0 2s
At this point, I have my command ready to capture the logs:
# kubectl logs timtest-exec-4 -n spark-apps
spark 22:47:54.62 INFO ==>
spark 22:47:54.62 INFO ==> Welcome to the Bitnami spark container
spark 22:47:54.62 INFO ==> Subscribe to project updates by watching https://github.com/bitnami/containers
spark 22:47:54.62 INFO ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
spark 22:47:54.63 INFO ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
spark 22:47:54.63 INFO ==>
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
24/06/27 22:47:57 INFO KubernetesExecutorBackend: Started daemon with process name: 1@timtest-exec-4
24/06/27 22:47:57 INFO SignalUtils: Registering signal handler for TERM
24/06/27 22:47:57 INFO SignalUtils: Registering signal handler for HUP
24/06/27 22:47:57 INFO SignalUtils: Registering signal handler for INT
24/06/27 22:47:58 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.hadoop.security.KerberosAuthException: failure to login: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
at jdk.security.auth/com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:71)
at jdk.security.auth/com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:134)
at java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:679)
at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:677)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
at java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:677)
at java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:587)
at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:2065)
at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1975)
at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:719)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:669)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:579)
at org.apache.spark.deploy.SparkHadoopUtil.createSparkUser(SparkHadoopUtil.scala:70)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.run(KubernetesExecutorBackend.scala:67)
at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.main(KubernetesExecutorBackend.scala:56)
at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend.main(KubernetesExecutorBackend.scala)
at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1986)
at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:719)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:669)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:579)
at org.apache.spark.deploy.SparkHadoopUtil.createSparkUser(SparkHadoopUtil.scala:70)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.run(KubernetesExecutorBackend.scala:67)
at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.main(KubernetesExecutorBackend.scala:56)
at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend.main(KubernetesExecutorBackend.scala)
Caused by: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
at jdk.security.auth/com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:71)
at jdk.security.auth/com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:134)
at java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:679)
at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:677)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
at java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:677)
at java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:587)
at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:2065)
at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1975)
at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:719)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:669)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:579)
at org.apache.spark.deploy.SparkHadoopUtil.createSparkUser(SparkHadoopUtil.scala:70)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.run(KubernetesExecutorBackend.scala:67)
at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend$.main(KubernetesExecutorBackend.scala:56)
at org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend.main(KubernetesExecutorBackend.scala)
at java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:850)
at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:679)
at java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:677)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
at java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:677)
at java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:587)
at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:2065)
at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1975)
... 8 more
The problem is thus exactly the one in the title of this issue:
javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
I explained what the underlying problem is and proposed a patch, which we are actively using in production. But now, every time we adopt a new version, I have to port the patch to it, so I'd rather have you fix it, even if your fix is completely different from my patch.
So, I added my fix to the latest version of the container and ran the above again, this time with my image, and this is the result:
# kubectl get pods -n spark-apps
NAME READY STATUS RESTARTS AGE
spark-pi-d9deaa905dea2f4f-driver 0/1 ContainerCreating 0 1s
# kubectl get pods -n spark-apps
NAME READY STATUS RESTARTS AGE
spark-pi-d9deaa905dea2f4f-driver 1/1 Running 0 4s
# kubectl get pods -n spark-apps
NAME READY STATUS RESTARTS AGE
spark-pi-d9deaa905dea2f4f-driver 1/1 Running 0 11s
timtest-exec-1 0/1 ContainerCreating 0 1s
timtest-exec-2 0/1 ContainerCreating 0 1s
# kubectl get pods -n spark-apps
NAME READY STATUS RESTARTS AGE
spark-pi-d9deaa905dea2f4f-driver 1/1 Running 0 13s
timtest-exec-1 1/1 Running 0 3s
timtest-exec-2 1/1 Running 0 3s
# kubectl get pods -n spark-apps
NAME READY STATUS RESTARTS AGE
spark-pi-d9deaa905dea2f4f-driver 1/1 Running 0 16s
timtest-exec-1 1/1 Terminating 0 6s
timtest-exec-2 1/1 Running 0 6s
# kubectl get pods -n spark-apps
NAME READY STATUS RESTARTS AGE
spark-pi-d9deaa905dea2f4f-driver 0/1 Completed 0 18s
If I look at the logs of the executor pods, I can see:
spark 08:21:48.83 INFO ==>
spark 08:21:48.83 INFO ==> Welcome to the Bitnami spark container
spark 08:21:48.84 INFO ==> Subscribe to project updates by watching https://github.com/bitnami/containers
spark 08:21:48.84 INFO ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
spark 08:21:48.84 INFO ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
spark 08:21:48.84 INFO ==>
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
24/06/28 08:21:51 INFO KubernetesExecutorBackend: Started daemon with process name: 1@timtest-exec-1
24/06/28 08:21:51 INFO SignalUtils: Registering signal handler for TERM
24/06/28 08:21:51 INFO SignalUtils: Registering signal handler for HUP
24/06/28 08:21:51 INFO SignalUtils: Registering signal handler for INT
24/06/28 08:21:52 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/06/28 08:21:52 INFO SecurityManager: Changing view acls to: spark
24/06/28 08:21:52 INFO SecurityManager: Changing modify acls to: spark
24/06/28 08:21:52 INFO SecurityManager: Changing view acls groups to:
24/06/28 08:21:52 INFO SecurityManager: Changing modify acls groups to:
24/06/28 08:21:52 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: spark; groups with view permissions: EMPTY; users with modify permissions: spark; groups with modify permissions: EMPTY
24/06/28 08:21:53 INFO TransportClientFactory: Successfully created connection to spark-pi-4c19f3905def8cba-driver-svc.spark-apps.svc/10.244.36.70:7078 after 134 ms (0 ms spent in bootstraps)
24/06/28 08:21:53 INFO SecurityManager: Changing view acls to: spark
24/06/28 08:21:53 INFO SecurityManager: Changing modify acls to: spark
24/06/28 08:21:53 INFO SecurityManager: Changing view acls groups to:
24/06/28 08:21:53 INFO SecurityManager: Changing modify acls groups to:
24/06/28 08:21:53 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: spark; groups with view permissions: EMPTY; users with modify permissions: spark; groups with modify permissions: EMPTY
24/06/28 08:21:53 INFO TransportClientFactory: Successfully created connection to spark-pi-4c19f3905def8cba-driver-svc.spark-apps.svc/10.244.36.70:7078 after 3 ms (0 ms spent in bootstraps)
24/06/28 08:21:53 INFO DiskBlockManager: Created local directory at /var/data/spark-c83a972e-7fb9-41a1-aedf-011abedef1ef/blockmgr-fddb1182-8537-4c1d-a510-242c49c666b4
24/06/28 08:21:53 INFO MemoryStore: MemoryStore started with capacity 413.9 MiB
24/06/28 08:21:54 INFO CoarseGrainedExecutorBackend: Connecting to driver: spark://CoarseGrainedScheduler@spark-pi-4c19f3905def8cba-driver-svc.spark-apps.svc:7078
24/06/28 08:21:54 INFO ResourceUtils: ==============================================================
24/06/28 08:21:54 INFO ResourceUtils: No custom resources configured for spark.executor.
24/06/28 08:21:54 INFO ResourceUtils: ==============================================================
24/06/28 08:21:54 INFO CoarseGrainedExecutorBackend: Successfully registered with driver
24/06/28 08:21:54 INFO Executor: Starting executor ID 1 on host 10.244.98.121
24/06/28 08:21:54 INFO Executor: OS info Linux, 5.15.0-107-generic, amd64
24/06/28 08:21:54 INFO Executor: Java version 17.0.11
24/06/28 08:21:54 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39283.
24/06/28 08:21:54 INFO NettyBlockTransferService: Server created on 10.244.98.121:39283
24/06/28 08:21:54 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
24/06/28 08:21:54 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(1, 10.244.98.121, 39283, None)
24/06/28 08:21:54 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(1, 10.244.98.121, 39283, None)
24/06/28 08:21:54 INFO BlockManager: Initialized BlockManager: BlockManagerId(1, 10.244.98.121, 39283, None)
24/06/28 08:21:54 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): ''
24/06/28 08:21:54 INFO Executor: Created or updated repl class loader org.apache.spark.util.MutableURLClassLoader@6fb151d6 for default.
24/06/28 08:21:54 INFO Executor: Fetching file:/opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.1.jar with timestamp 1719562904826
24/06/28 08:21:54 INFO Utils: Copying /opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.1.jar to /var/data/spark-c83a972e-7fb9-41a1-aedf-011abedef1ef/spark-1bd7d629-411e-4bf5-945b-ee142934d83b/66015011719562904826_cache
24/06/28 08:21:54 INFO Utils: Copying /var/data/spark-c83a972e-7fb9-41a1-aedf-011abedef1ef/spark-1bd7d629-411e-4bf5-945b-ee142934d83b/66015011719562904826_cache to /opt/bitnami/spark/./spark-examples_2.12-3.5.1.jar
24/06/28 08:21:54 INFO Executor: Adding file:/opt/bitnami/spark/./spark-examples_2.12-3.5.1.jar to class loader default
24/06/28 08:21:55 INFO CoarseGrainedExecutorBackend: Got assigned task 1
24/06/28 08:21:55 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
24/06/28 08:21:56 INFO TorrentBroadcast: Started reading broadcast variable 0 with 1 pieces (estimated total size 4.0 MiB)
24/06/28 08:21:56 INFO TransportClientFactory: Successfully created connection to spark-pi-4c19f3905def8cba-driver-svc.spark-apps.svc/10.244.36.70:7079 after 13 ms (0 ms spent in bootstraps)
24/06/28 08:21:56 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.3 KiB, free 413.9 MiB)
24/06/28 08:21:56 INFO TorrentBroadcast: Reading broadcast variable 0 took 287 ms
24/06/28 08:21:56 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 4.0 KiB, free 413.9 MiB)
24/06/28 08:21:56 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 1055 bytes result sent to driver
24/06/28 08:21:58 INFO CoarseGrainedExecutorBackend: Driver commanded a shutdown
24/06/28 08:21:58 INFO MemoryStore: MemoryStore cleared
24/06/28 08:21:58 INFO BlockManager: BlockManager stopped
24/06/28 08:21:58 INFO ShutdownHookManager: Shutdown hook called
24/06/28 08:21:58 INFO ShutdownHookManager: Deleting directory /var/data/spark-c83a972e-7fb9-41a1-aedf-011abedef1ef/spark-1bd7d629-411e-4bf5-945b-ee142934d83b
@lgarg-kimbal
I've made my fixed image publicly available for you to try. If it works with this image, it is the same bug that is bothering you, even though you are using Docker Compose.
verboistim/spark:3.5.1-debian-12-r7-bugfix
Rafael,
I already did last year:
https://github.com/bitnami/containers/pull/52661
But it was rejected by Michiel (mdhont) because it did not follow the strategy you use, and that is fine by me. Just please fix it another way then.
(Quoted reply: "Hi, would you like to contribute and send a PR with your patch? We will be glad to review and merge it.")
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.
Hi, I am facing this issue as well. Any updates on this?
I found a problem with
LD_PRELOAD=$LIBNSS_WRAPPER_PATH
This part was badly written.
I use Kubernetes mode, so my workaround is
--conf spark.executorEnv.LD_PRELOAD=/opt/bitnami/common/lib/libnss_wrapper.so
Could you refactor this part? Right now Spark does not use /opt/bitnami/spark/conf/spark-env.sh with LD_PRELOAD.
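For completeness, a sketch of how that workaround looks when passed for both the driver and the executors on spark-submit (the library path is the one already shown in this thread; all other options are elided and should match your own command):
spark-submit \
  ... \
  --conf spark.kubernetes.driverEnv.LD_PRELOAD=/opt/bitnami/common/lib/libnss_wrapper.so \
  --conf spark.executorEnv.LD_PRELOAD=/opt/bitnami/common/lib/libnss_wrapper.so \
  local:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.1.jar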
Name and Version
bitnami/spark:3.5.0-debian-11-r12
What architecture are you using?
amd64
What steps will reproduce the bug?
I used the Bitnami Helm chart to set up Spark.
Then I started an example application from a client:
It starts the driver, and after that it starts spinning up executors:
Now, the executor crashes over and over again, being replaced by "-2", "-3", ... until it finally gives up and exits. The pods remain in the Error state. I was able to get this log from the executor pods:
What is the expected behavior?
It should not fail; it should just start.
What do you see instead?
Additional information
I figured out through research on the internet that the executor script tries to find the name of the user, and since the user is not defined in /etc/passwd, this fails, resulting in an empty "name" string. This means that the java command gets an empty name variable, resulting in the error.
A lot of users work around this by creating their own image and correcting the problem in a number of different ways, like adding a line to /etc/passwd. Some even remove the "USER" in the Dockerfile and run it as a root container (!!).
I can imagine you want this fixed at the source (the Spark script itself), but this means that until that happens, a lot of people are overriding your image by fixing it themselves. Why don't you ship a working workaround now and take it out once it is fixed at the source?
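For context, the usual shape of such a workaround is a small wrapper that fabricates a passwd entry for the arbitrary UID via nss_wrapper before launching Spark. The following is only a sketch of that general pattern (paths follow the Bitnami layout mentioned above), not the actual Bitnami scripts:
#!/bin/bash
# If the current UID has no passwd entry, fake one through nss_wrapper so that
# UnixPrincipal receives a non-null user name.
if ! getent passwd "$(id -u)" > /dev/null 2>&1; then
    export NSS_WRAPPER_PASSWD="$(mktemp)"
    export NSS_WRAPPER_GROUP="$(mktemp)"
    echo "spark:x:$(id -u):$(id -g):spark:/opt/bitnami/spark:/bin/false" > "$NSS_WRAPPER_PASSWD"
    echo "spark:x:$(id -g):" > "$NSS_WRAPPER_GROUP"
    export LD_PRELOAD=/opt/bitnami/common/lib/libnss_wrapper.so
fi
# Hand over to the original entrypoint/command, which can now resolve a user name.
exec "$@"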