opencypher / morpheus

Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.
Apache License 2.0

Could not initialize class org.opencypher.morpheus.impl.MorpheusFunctions #948

Open lakshmanok opened 4 years ago

lakshmanok commented 4 years ago

The following code, based on https://github.com/opencypher/morpheus/blob/0.4.2/morpheus-examples/src/main/scala/org/opencypher/morpheus/examples/CypherSQLRoundtripExample.scala :

val result = graph.cypher("""
  |MATCH
  | (t1:Trip)-[:CONTAINS]-(s1:Stop)
  |RETURN t1.id AS trip_id
""".stripMargin)
result.records.asMorpheus.df.toDF("trip_id").createOrReplaceTempView("results")

gives me this error: error: value asMorpheus is not a member of result.Records

lakshmanok commented 4 years ago

Looking at this further, it seems to be a problem with some dependency not being loaded. I'm running the program from the Spark Scala shell as follows:

spark-shell --packages=org.opencypher:morpheus-spark-cypher:0.4.2

The output indicates that Morpheus is getting installed and I can use classes like MorpheusNodeTable without any problem.

However, doing "result.records" throws this initialization error:

java.lang.NoClassDefFoundError: Could not initialize class org.opencypher.morpheus.impl.MorpheusFunctions$
  at org.opencypher.morpheus.impl.SparkSQLExprMapper$RichExpression.nullSafeConversion(SparkSQLExprMapper.scala:66)
  at org.opencypher.morpheus.impl.SparkSQLExprMapper$RichExpression.asSparkSQLExpr(SparkSQLExprMapper.scala:97)
  at org.opencypher.morpheus.impl.table.SparkTable$DataFrameTable.$anonfun$withColumns$2(SparkTable.scala:85)
  at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
  at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
  at scala.collection.immutable.List.foldLeft(List.scala:89)
  at org.opencypher.morpheus.impl.table.SparkTable$DataFrameTable.withColumns(SparkTable.scala:84)
  at org.opencypher.morpheus.impl.table.SparkTable$DataFrameTable.withColumns(SparkTable.scala:53)
  at org.opencypher.okapi.relational.impl.operators.AddInto._table$lzycompute(RelationalOperator.scala:262)
  at org.opencypher.okapi.relational.impl.operators.AddInto._table(RelationalOperator.scala:260)
...

The statement import org.opencypher.morpheus.impl.MorpheusFunctions works, so the class itself is being found. The problem is likely with some static initialization that this class performs, and which is not captured in the Maven package ...
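The "Could not initialize class" wording fits that guess. On the JVM, when a class's static initializer throws, the first access normally fails with ExceptionInInitializerError, and every later access to the same class fails with NoClassDefFoundError. The sketch below reproduces that pattern in plain Scala. Broken and InitDemo are hypothetical names and have nothing to do with the Morpheus sources. It is meant to be compiled with scalac rather than pasted into the REPL, where objects may be wrapped differently.

```scala
// Stand-in for a class whose static initializer fails (hypothetical, not Morpheus code).
object Broken {
  val value: Int = throw new IllegalStateException("initializer failed")
}

object InitDemo {
  // Touch Broken and report which error the JVM raises for this access.
  def attempt(): String =
    try {
      Broken.value.toString
    } catch {
      case e: ExceptionInInitializerError =>
        s"${e.getClass.getName} caused by ${e.getCause}"
      case e: NoClassDefFoundError =>
        s"${e.getClass.getName}: ${e.getMessage}"
    }

  def main(args: Array[String]): Unit = {
    println(attempt()) // first access: the initializer runs and fails
    println(attempt()) // later access: the class is already marked as failed
  }
}
```

If the same thing is happening here, the stack trace above would be from a second or later access, and the original cause would be in an earlier ExceptionInInitializerError in the same session.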

MarcianoAvihay commented 4 years ago

Hi ... I suggest not using the Maven package, and instead using the latest fat JAR release from this repository or building this repo yourself. (I initially started with the fat JAR and had no issues, and later switched to building it myself in order to add Spark 3 support and fix some bugs I encountered.)

lakshmanok commented 4 years ago

Okay, found the problem. It is line 53 of the MorpheusFunctions initializer:

https://github.com/opencypher/morpheus/blob/master/morpheus-spark-cypher/src/main/scala/org/opencypher/morpheus/impl/MorpheusFunctions.scala#L53

Start the shell with

spark-shell --packages=org.opencypher:morpheus-spark-cypher:0.4.2 --conf spark.sql.legacy.allowUntypedScalaUDF=True

to work around this problem.

Another way to set this flag is to do:

spark.conf.set("spark.sql.legacy.allowUntypedScalaUDF", true)
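For a standalone application rather than the shell, the same flag can presumably be supplied when the SparkSession is built. The following is a minimal sketch using only the Spark API. The object name, app name and local master are assumptions for illustration, and the Morpheus session setup is left out.

```scala
import org.apache.spark.sql.SparkSession

object AllowUntypedUdfSession {
  // Config key taken from the workaround in this thread.
  val FlagKey = "spark.sql.legacy.allowUntypedScalaUDF"

  // Build a session with the flag already set, so it is in place
  // before any Morpheus class gets initialized.
  def build(): SparkSession =
    SparkSession.builder()
      .master("local[*]")              // assumption: local run for illustration
      .appName("morpheus-workaround")  // hypothetical app name
      .config(FlagKey, "true")
      .getOrCreate()

  def main(args: Array[String]): Unit = {
    val spark = build()
    println(s"$FlagKey = ${spark.conf.get(FlagKey)}")
    spark.stop()
  }
}
```

Either way, the flag most likely has to be set before the first call that touches MorpheusFunctions. Once a class has failed to initialize, the JVM keeps it in that failed state, so in a shell session where result.records has already thrown, restarting the shell with the flag is the safer route.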