radanalyticsio / spark-operator

Operator for managing the Spark clusters on Kubernetes and OpenShift.
Apache License 2.0

spark session creation error -None.org.apache.spark.api.java.JavaSparkContext. : java.lang.NullPointerException #343

Open sajanraj opened 3 years ago

sajanraj commented 3 years ago

Spark session creation fails from a Jupyter notebook (.ipynb) with the following code:

from pyspark import SparkContext
from pyspark.sql import SparkSession, HiveContext
#config = pyspark.SparkConf().setAll([('spark.executor.memory', '8g'), ('spark.executor.cores', '5'), ('spark.cores.max', '16'), ('spark.driver.memory','8g')])
spark = SparkSession \
                    .builder \
                    .master("spark://192.168.9.51:30228") \
                    .appName("Python Spark SQL Hive integration example") \
                    .getOrCreate()
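When the driver (the notebook) runs outside the Kubernetes cluster, the workers must also be able to connect back to it. A minimal sketch of the extra driver-side settings this usually requires; the option keys are standard Spark configuration names, but the address and port values are assumptions for illustration, not taken from this setup:

```python
# Sketch of driver-connectivity settings for a driver running outside the
# cluster. The keys are standard Spark configuration options; the values are
# hypothetical and must match the notebook machine's actual network setup.
extra_conf = {
    "spark.driver.host": "192.168.9.100",   # hypothetical notebook IP, reachable from the pods
    "spark.driver.bindAddress": "0.0.0.0",  # bind on all local interfaces
    "spark.driver.port": "40000",           # fixed ports so they can be opened
    "spark.blockManager.port": "40001",     # in any firewall between host and cluster
}

# Each pair would be passed to the builder via .config(key, value):
for key, value in extra_conf.items():
    print(f".config({key!r}, {value!r})")
```

Each `.config(key, value)` call would go on the builder chain before `.getOrCreate()`.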

Description:

Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NullPointerException
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:608)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:238)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)

In the Spark master logs:

21/07/01 05:26:37 ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() for one-way message.
java.io.InvalidClassException: org.apache.spark.deploy.ApplicationDescription; local class incompatible: stream classdesc serialVersionUID = 1574364215946805297, local class serialVersionUID = 6543101073799644159
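An `InvalidClassException` reporting two different `serialVersionUID` values is the classic symptom of the local PySpark installation not matching the Spark version running in the cluster. Spark requires the client and cluster to agree at least on the major.minor version; a minimal sketch of that compatibility rule (the version strings below are illustrative assumptions, not taken from this setup):

```python
def versions_compatible(client_version: str, cluster_version: str) -> bool:
    """Spark clients and clusters must agree at least on major.minor."""
    return client_version.split(".")[:2] == cluster_version.split(".")[:2]

print(versions_compatible("2.4.5", "3.0.1"))  # False: 2.4 client vs 3.0 cluster
print(versions_compatible("3.0.1", "3.0.3"))  # True: patch versions may differ
```

The client version can be read with `pyspark.__version__` in the notebook, and the cluster version is shown in the master web UI header.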

kubectl get all

NAME                                  READY   STATUS    RESTARTS   AGE
pod/continuous-image-puller-5gdvh     1/1     Running   0          5d7h
pod/discovery                         0/1     Evicted   0          52d
pod/hub-5685f48988-m4gqn              1/1     Running   0          5d6h
pod/my-spark-cluster-m-gvzqt          1/1     Running   0          29h
pod/my-spark-cluster-w-mcxxg          1/1     Running   0          29h
pod/my-spark-cluster-w-mwjw7          1/1     Running   0          29h
pod/postgres-5c7489f58-vckph          1/1     Running   1          9d
pod/proxy-75ccb77958-tkrq6            1/1     Running   0          5d7h
pod/spark-operator-844dd6bc95-v82br   1/1     Running   1          29h
pod/user-scheduler-7cb4878498-6nwhs   1/1     Running   0          5d7h
pod/user-scheduler-7cb4878498-x5cbr   1/1     Running   0          5d7h

NAME                                       DESIRED   CURRENT   READY   AGE
replicationcontroller/my-spark-cluster-m   1         1         1       29h
replicationcontroller/my-spark-cluster-w   2         2         2       29h

NAME                             TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
service/hub                      ClusterIP      10.97.124.21     <none>        8081/TCP                        5d7h
service/kubernetes               ClusterIP      10.96.0.1        <none>        443/TCP                         54d
service/my-spark-cluster         ClusterIP      10.107.102.116   <none>        7077/TCP                        29h
service/my-spark-cluster-m       NodePort       10.101.40.21     <none>        7077:30228/TCP,8080:31076/TCP   29h
service/my-spark-cluster-ui      ClusterIP      10.109.166.196   <none>        8080/TCP                        29h
service/postgres-service         ClusterIP      10.101.37.129    <none>        5432/TCP                        9d
service/proxy-api                ClusterIP      10.102.235.84    <none>        8001/TCP                        5d7h
service/proxy-public             LoadBalancer   10.107.165.235   <pending>     80:32361/TCP                    5d7h
service/spark-operator-metrics   ClusterIP      10.104.148.123   <none>        8080/TCP                        29h

NAME                                     DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/continuous-image-puller   1         1         1       1            1           <none>          5d7h

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/hub              1/1     1            1           5d7h
deployment.apps/postgres         1/1     1            1           9d
deployment.apps/proxy            1/1     1            1           5d7h
deployment.apps/spark-operator   1/1     1            1           29h
deployment.apps/user-scheduler   2/2     2            2           5d7h

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/hub-5685f48988              1         1         1       5d7h
replicaset.apps/postgres-5c7489f58          1         1         1       9d
replicaset.apps/proxy-75ccb77958            1         1         1       5d7h
replicaset.apps/spark-operator-844dd6bc95   1         1         1       29h
replicaset.apps/user-scheduler-7cb4878498   2         2         2       5d7h

NAME                                READY   AGE
statefulset.apps/user-placeholder   0/0     5d7h

I can access the Spark master web UI at http://192.168.9.51:31076/, which shows:

URL: spark://192.168.79.150:7077
Alive Workers: 2
Cores in use: 2 Total, 0 Used
Memory in use: 13.5 GB Total, 0.0 B Used
Applications: 0 Running, 0 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE

Steps to reproduce:

  1. Jupyter is running outside Kubernetes, on the local machine.
DEbydeepak commented 9 months ago

Hi, how do I resolve this error? I am using Jupyter on a local machine. When I do not provide executor parameters in the config, the session does not throw any error, and the driver acts as both master and worker.