databricks / spark-corenlp

Stanford CoreNLP wrapper for Apache Spark
GNU General Public License v3.0

Getting error Exception in thread "main" java.lang.NoSuchMethodError #25

Closed raghugvt closed 7 years ago

raghugvt commented 7 years ago

Hi,

I am trying to run Spark-CoreNLP but getting the following error:

Exception in thread "main" java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;
	at com.databricks.spark.corenlp.functions$.cleanxml(functions.scala:54)

My SBT configuration looks like this:

scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.11" % "2.0.0",
  "org.apache.spark" % "spark-sql_2.11" % "2.0.0",
  "edu.stanford.nlp" % "stanford-corenlp" % "3.6.0",
  "edu.stanford.nlp" % "stanford-corenlp" % "3.6.0" % "test" classifier "models",
  "databricks" % "spark-corenlp" % "0.2.0-s_2.11"
)

I am running on a standalone Spark 2.0.0 cluster (Scala 2.11.8) using:

spark-submit --jars C:\Users\raghugvt\.ivy2\cache\databricks\spark-corenlp\jars\spark-corenlp-0.1.jar,C:\Users\raghugvt\.ivy2\cache\edu.stanford.nlp\stanford-corenlp\jars\stanford-corenlp-3.6.0-models.jar,C:\Users\raghugvt\.ivy2\cache\edu.stanford.nlp\stanford-corenlp\jars\stanford-corenlp-3.6.0.jar --class SparkStanfordNLPTest --master local[2] target\scala-2.11\TestSparkCoreNLP_2.11-1.0.jar

Please help.

raghugvt commented 7 years ago

I was able to solve this issue. Here is what I followed to fix it:

  1. I used the following SBT configuration (build.sbt); note the original had a duplicate exclude of commons-collections, which I have removed here:

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  ("org.apache.spark" % "spark-core_2.11" % "2.0.0" % "provided").
    exclude("org.mortbay.jetty", "servlet-api").
    exclude("commons-beanutils", "commons-beanutils-core").
    exclude("commons-collections", "commons-collections").
    exclude("com.esotericsoftware.minlog", "minlog"),
  "org.apache.spark" % "spark-sql_2.11" % "2.0.0" % "provided",
  "edu.stanford.nlp" % "stanford-corenlp" % "3.7.0",
  "edu.stanford.nlp" % "stanford-corenlp" % "3.7.0" % "test" classifier "models",
  "databricks" % "spark-corenlp" % "0.2.0-s_2.11"
)

resolvers += "SparkPackages" at "https://dl.bintray.com/spark-packages/maven/"

resolvers += Resolver.url("bintray-sbt-plugins", url("http://dl.bintray.com/sbt/sbt-plugin-releases"))(Resolver.ivyStylePatterns)

  2. I used the sbt-assembly plugin to bundle the spark-corenlp and stanford-corenlp jars into the application jar. Once the application jar contains all its dependencies, the job can be submitted to the Spark cluster with spark-submit. This eliminates the NoSuchMethodError, which the Spark environment was otherwise throwing when running in cluster mode.

     A. Add a file project/plugin.sbt with the contents: addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

     B. Create a fat jar using sbt assembly

  3. Spark-submit command:

/opt/spark-2.1.0-bin-hadoop2.7/bin/spark-submit --jars /path/to/stanford-corenlp-3.7.0.jar,/path/to/stanford_corenlp/stanford-corenlp-3.7.0-models.jar --class SparkStanfordNLPTest --master spark://spark.cluster.ip:7077 SparkCoreNLPTest-assembly-1.0.jar
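For anyone reproducing the fat-jar step: assembling Spark applications usually fails on duplicate META-INF entries, so a merge strategy is typically needed in build.sbt. This is a sketch of what I would expect for sbt-assembly 0.14.x; it is my assumption, not something the original poster showed:

```scala
// build.sbt fragment (sketch): resolve conflicting files that multiple
// dependency jars ship, so that `sbt assembly` can produce one fat jar.
// Discarding META-INF avoids signature-file clashes; everything else
// keeps the first copy encountered.
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _                             => MergeStrategy.first
}
```

With this in place, `sbt assembly` produces a single jar (e.g. target/scala-2.11/SparkCoreNLPTest-assembly-1.0.jar) that can be passed directly to spark-submit.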

xiaohai2016 commented 7 years ago

I got the same error when following the instructions found at: https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/1233855/3233914658600709/588180/latest.html. The only difference is that I used Scala 2.11 and corenlp version s_2.11. Thanks.

gayathridk commented 7 years ago

Though I followed the above procedure, I am getting the below error when I run the Scala code: not found: value sqlContext.

Code:

import org.apache.spark.sql.functions._
import com.databricks.spark.corenlp.functions._
import org.apache.spark.{SparkConf, SparkContext}

object classification {
  val conf = new SparkConf().setAppName("classification").setMaster("local")
  val sc = new SparkContext(conf)
  import sqlContext.implicits._
  def main(args: Array[String]) {
    val input = Seq((1, "Stanford University is located in California. I$
    val output = input.select(cleanxml('text).as('doc)).select(explode(ssplit$
    output.show(truncate = false)
  }
}

(The two truncated lines above are cut off as pasted from my console.)
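The compile error happens because no value named sqlContext is ever defined in that object, so `import sqlContext.implicits._` cannot resolve. With Spark 2.x the usual entry point is a SparkSession. Below is a minimal sketch of how I would restructure the program; the sample sentence and column names are my own placeholders, since the original lines were truncated:

```scala
// Sketch of a fix (assumptions: Spark 2.x, spark-corenlp 0.2.0):
// create a SparkSession and import ITS implicits instead of the
// undefined sqlContext, then build the DataFrame inside main.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import com.databricks.spark.corenlp.functions._

object Classification {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("classification")
      .master("local")
      .getOrCreate()
    import spark.implicits._  // enables Seq(...).toDF and 'symbol column syntax

    // Hypothetical input row; the original string was cut off in the paste above.
    val input = Seq((1, "<xml>Stanford University is located in California.</xml>"))
      .toDF("id", "text")

    val output = input
      .select(cleanxml('text).as('doc))       // strip XML tags
      .select(explode(ssplit('doc)).as('sen)) // one row per sentence
    output.show(truncate = false)

    spark.stop()
  }
}
```

Defining the session inside main (rather than as an object field) also avoids serialization surprises when the same code is later run on a cluster.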