harsha2010 / magellan

Geo Spatial Data Analytics on Spark
Apache License 2.0
533 stars 149 forks

java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericInternalRow cannot be cast to java.lang.Double #177

Closed: mdbuck closed this issue 6 years ago

mdbuck commented 6 years ago

Started to use Magellan and ran into the above exception. The code does the following:

val csv = spark.read.option("delimiter", ",").option("header", value = false).csv(polygonsUrl.toExternalForm)

val polygons = csv.select(expressions.wkt(csv("_c0")).as("polygon"), csv("_c1"))

import polygons.sqlContext.implicits._
val frame = polygons.select(point(lit(0d), lit(1d)) intersects $"polygon")

I expected all results to be false; however, running the attached application yields the above exception. The attached .zip file contains a simple driver application, the polygons.csv, and a Gradle build script. The build script is only included to document the dependencies I am using.

[MagellanPolygonTest1.zip](https://github.com/harsha2010/magellan/files/1333624/MagellanPolygonTest1.zip)
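The exception itself can be reproduced without Spark: a value that is actually a structured row cannot be cast to a Double at runtime. The `GenericRow` class below is a hypothetical stand-in for Catalyst's `GenericInternalRow`, not Magellan's or Spark's API; this is only a minimal sketch of the failure mode.

```scala
object StructCastDemo {
  // Hypothetical stand-in for Catalyst's GenericInternalRow: a struct-like value.
  final case class GenericRow(values: Array[Any])

  def main(args: Array[String]): Unit = {
    // What a geometry parse yields conceptually: a struct, not a scalar.
    val parsed: Any = GenericRow(Array(0.0, 1.0))
    val result =
      try { parsed.asInstanceOf[Double].toString } // the bad cast
      catch { case e: ClassCastException => e.getClass.getSimpleName }
    println(result) // prints "ClassCastException"
  }
}
```

This mirrors what happens when a plan expects a Double column but the expression actually produces a struct.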

harsha2010 commented 6 years ago

@mdbuck can you convert the lat/long values to doubles? I think you are using strings.

mdbuck commented 6 years ago

Not sure what you mean. The point in the SELECT is using doubles, as indicated by the 'd' suffix in the lit() expressions.

After some more investigation I discovered that the wkt expression returns a StructType, not a Polygon as I had assumed. I will investigate further.
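The point being discovered here can be sketched in plain Scala: parsing WKT text yields a structured geometry value, not a plain number, which is why the result behaves like a struct. The `WktPoint` class and `parsePoint` helper below are hypothetical illustrations, not Magellan's actual parser.

```scala
// Toy sketch, not Magellan's API: a WKT string parses into a structured value.
final case class WktPoint(x: Double, y: Double)

object WktSketch {
  // Hypothetical parser for the simplest WKT form, e.g. "POINT (30 10)".
  def parsePoint(wkt: String): WktPoint = {
    val body = wkt.stripPrefix("POINT").trim.stripPrefix("(").stripSuffix(")").trim
    val Array(x, y) = body.split("\\s+").map(_.toDouble)
    WktPoint(x, y)
  }

  def main(args: Array[String]): Unit = {
    println(parsePoint("POINT (30 10)")) // WktPoint(30.0,10.0)
  }
}
```

The parse result is a composite value; downstream code has to treat it as such rather than expecting a scalar.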

Thanks.

mdbuck commented 6 years ago

Looks like my understanding of wkt is correct, so I will close this ticket. I would like to update the documentation to describe Magellan's WKT behavior and usage pattern.

Thanks for your time.