ywilkof / spark-jobs-rest-client

Fluent client for interacting with Spark Standalone Mode's REST API for submitting, killing, and monitoring the state of jobs.

Application not able to find jars #6

Closed · aminaaslam closed this issue 8 years ago

aminaaslam commented 8 years ago

Hi @ywilkof, I am trying to submit an application that has multiple dependencies. I am providing the jars via the `usingJars()` method. My jar string looks like this (truncated here): `final String jars = "/root/mb4spark/dist/MB4Spark-1.0.0-a1/MB4Spark-1.0.0-a1.jar,"`

This string is then tokenized and added to a `HashSet`, which is then passed into the `usingJars()` method. My Spark cluster also runs on my local machine.
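For reference, a minimal sketch of that tokenization step (the sample paths are shortened and the exact splitting logic is an assumption; the statements belong inside a method):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Split the comma-separated jar list into a Set for usingJars().
// A trailing comma is harmless here: String.split() drops trailing empty tokens.
final String jars = "/root/mb4spark/dist/MB4Spark-1.0.0-a1/MB4Spark-1.0.0-a1.jar,"
        + "/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/json-simple-1.1.1.jar";
final Set<String> jarSet = new HashSet<>(Arrays.asList(jars.split(",")));
```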

This is how my submit request looks:

```java
final String submissionId = task.getSparkRestClient().prepareJobSubmit()
        .appName(task.getAppName())
        .mainClass("com.baesystems.ai.analytics.ModelingEngine")
        .usingJars(task.getJarSet())
        .appArgs(task.getAppArguments())
        .appResource(task.getAppResources())
        .withProperties()
            .put("spark.eventLog.enabled", "true")
            .put("spark.dynamicAllocation.enabled", "true")
            .put("spark.shuffle.service.enabled", "true")
            .put("spark.dynamicAllocation.minExecutors", "1")
            .put("spark.dynamicAllocation.maxExecutors", "2")
            .put("spark.user.dir", "/opt/mb4spark")
            .put("spark.local.dir", "/tmp")
            .put("spark.hadoop.validateOutputSpecs", "false")
            .put("spark.executor.extraJavaOptions", "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
        .submit();
```

But it keeps failing with a `ClassNotFoundException`. Here is the error that I see on the worker:

```
Launch Command: "/opt/jdk1.8.0_102/bin/java" "-cp" "/usr/local/spark/conf/:/usr/local/spark/jars/*" "-Xmx1024M" "-Dspark.dynamicAllocation.minExecutors=1" "-Dspark.master=spark://0.0.0.0:7077" "-Dspark.jars=/root/mb4spark/dist/MB4Spark-1.0.0-a1/MB4Spark-1.0.0-a1.jar,/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/auto-common-0.3.jar,/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/auto-service-1.0-rc2.jar,/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/commons-io-2.5.jar,/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/fast-classpath-scanner-1.99.0.jar,/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/fastutil-7.0.12.jar,/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/guava-19.0.jar,/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/jackson-core-2.8.1.jar,/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/jpmml-converter-1.0.8.jar,/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/jpmml-sparkml-1.1-SNAPSHOT.jar,/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/modeling-engine-common-1.0.0.jar,/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/pmml-evaluator-1.2.15.jar,/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/spark-MDLP-discretization-1.2.12.jar,file:/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/json-simple-1.1.1.jar" "-Dspark.local.dir=/tmp" "-Dspark.shuffle.service.enabled=true" "-Dspark.user.dir=/opt/mb4spark" "-Dspark.dynamicAllocation.maxExecutors=2" "-Dspark.app.name=Test Relevance" "-Dspark.hadoop.validateOutputSpecs=false" "-Dspark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" "-Dspark.dynamicAllocation.enabled=true" "-Dspark.eventLog.enabled=false" "org.apache.spark.deploy.worker.DriverWrapper" "spark://Worker@172.16.166.241:32859" "/usr/local/spark/work/driver-20160913134535-0001/modeling-engine-common-1.0.0.jar" "com.baesystems.ai.analytics.ModelingEngine" "/root/mb4spark/capabilities-modeling-config.json"
```

```
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/09/13 13:45:40 WARN Utils: Your hostname, localhost resolves to a loopback address: 127.0.0.1; using 172.16.166.241 instead (on interface eno16777736)
16/09/13 13:45:40 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/09/13 13:45:40 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/09/13 13:45:40 INFO SecurityManager: Changing view acls to: root
16/09/13 13:45:40 INFO SecurityManager: Changing modify acls to: root
16/09/13 13:45:40 INFO SecurityManager: Changing view acls groups to:
16/09/13 13:45:40 INFO SecurityManager: Changing modify acls groups to:
16/09/13 13:45:40 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
16/09/13 13:45:41 INFO Utils: Successfully started service 'Driver' on port 32870.
16/09/13 13:45:41 INFO WorkerWatcher: Connecting to worker spark://Worker@172.16.166.241:32859
Exception in thread "main" java.lang.NoClassDefFoundError: org/json/simple/parser/ParseException
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.spark.util.Utils$.classForName(Utils.scala:225)
	at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:56)
	at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.ClassNotFoundException: org.json.simple.parser.ParseException
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 5 more
```
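Note that the `-cp` handed to `DriverWrapper` above contains only Spark's own jars plus the copied appResource (`modeling-engine-common-1.0.0.jar`), while the missing class, `org.json.simple.parser.ParseException`, lives in `json-simple-1.1.1.jar`, which appears in `spark.jars` but not on that classpath. If it is the driver JVM (rather than the executors) that cannot see the dependency, setting `spark.driver.extraClassPath` may help. A hedged sketch reusing the submit chain above (the single path is illustrative and must exist on whichever worker ends up launching the driver):

```java
// Hedged sketch: also expose the failing dependency to the driver JVM via
// spark.driver.extraClassPath, in addition to usingJars().
final String submissionId = task.getSparkRestClient().prepareJobSubmit()
        .appName(task.getAppName())
        .mainClass("com.baesystems.ai.analytics.ModelingEngine")
        .usingJars(task.getJarSet())
        .appArgs(task.getAppArguments())
        .appResource(task.getAppResources())
        .withProperties()
            .put("spark.driver.extraClassPath",
                 "/root/mb4spark/dist/MB4Spark-1.0.0-a1/lib/json-simple-1.1.1.jar")
        .submit();
```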

ronyv89 commented 6 years ago

How did you fix this?

avichaym commented 6 years ago

@ronyv89, @aminaaslam, I am also encountering the same issue with this client... did you find a way to make it recognize those jars?
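One detail worth ruling out here: in the launch command above, `json-simple-1.1.1.jar` is the only `spark.jars` entry carrying a `file:` prefix while the rest are bare paths, and json-simple is exactly the jar whose classes go missing. If the tokenization produces mixed URI schemes, normalizing every entry before calling `usingJars()` is a cheap experiment. A hypothetical helper (the name `normalizeJars` and the choice of `file:` URIs are assumptions; the jars must still exist at the same absolute path on every worker):

```java
import java.io.File;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical: map every jar entry to a consistent file: URI so that all
// spark.jars entries reach the worker in the same form.
static Set<String> normalizeJars(Set<String> jarSet) {
    return jarSet.stream()
            .map(p -> p.startsWith("file:") ? p : new File(p).toURI().toString())
            .collect(Collectors.toSet());
}
```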