Hello everyone,
I used Spark to develop an example of a geospatial query, but I'm running into an issue.
1) To meet the needs of Spark processing, I modified the nyc-boroughs.geojson file so that each feature takes one line. The content of the file looks like this:
{ "type": "", "id": , "properties": { }, "geometry": { "type": "", "coordinates": [ ] } }
{ "type": "", "id": , "properties": { }, "geometry": { "type": "", "coordinates": [ ] } }
…………
{ "type": "", "id": , "properties": { }, "geometry": { "type": "", "coordinates": [ ] } }
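For example, a single newline-delimited feature line could look like the following (hypothetical, illustrative values only, not copied from the real file):

```json
{ "type": "Feature", "id": 0, "properties": { "borough": "Manhattan" }, "geometry": { "type": "Polygon", "coordinates": [ [ [-74.01, 40.70], [-73.97, 40.70], [-73.97, 40.75], [-74.01, 40.75], [-74.01, 40.70] ] ] } }
```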
2) I used the GeoJson.scala and RichGeometry.scala from "ch08-geotime"; my code looks like this:
import org.apache.spark.{SparkConf, SparkContext}
import com.esri.core.geometry.Point
import spray.json._
// Feature and GeoJsonProtocol come from GeoJson.scala in ch08-geotime
import com.cloudera.datascience.geotime._
import com.cloudera.datascience.geotime.GeoJsonProtocol._

val conf = new SparkConf().setAppName("geometry api test").setMaster("spark://master:7077")
val sc = new SparkContext(conf)
val input = sc.textFile("hdfs://master:8020/data/jsondata/nyc-boroughs.geojson")
val features = input.map(line => line.parseJson.convertTo[Feature])
val resultsRDD = features.filter(feature =>
  feature.geometry.contains(new Point(-74.0059731, 40.7143528)))
resultsRDD.collect().foreach { feature =>
  println(feature.properties)
  println(feature.geometry.geometry.getType)
}
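For context, the `contains` call above is the Esri Geometry API's point-in-polygon test (exposed through RichGeometry). As a rough, stdlib-only illustration of the underlying idea (ray casting over a hypothetical square polygon, ignoring holes and spatial references, and not the library's actual implementation):

```scala
// Simplified ray-casting point-in-polygon test (illustrative only --
// the Esri Geometry API's contains handles holes, precision, etc.).
def contains(polygon: Seq[(Double, Double)], x: Double, y: Double): Boolean = {
  var inside = false
  var j = polygon.length - 1
  for (i <- polygon.indices) {
    val (xi, yi) = polygon(i)
    val (xj, yj) = polygon(j)
    // Does edge (i, j) straddle the horizontal line through y,
    // and does the ray to the left of (x, y) cross it?
    val crosses = (yi > y) != (yj > y)
    if (crosses && x < (xj - xi) * (y - yi) / (yj - yi) + xi) inside = !inside
    j = i
  }
  inside
}

// A square loosely around lower Manhattan (hypothetical coordinates).
val square = Seq((-75.0, 40.0), (-73.0, 40.0), (-73.0, 42.0), (-75.0, 42.0))
println(contains(square, -74.0059731, 40.7143528)) // true
```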
3) Everything is fine in the steps above, but when I run the final action (the resultsRDD.collect() call), I get an error like this:
WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, 172.16.50.75, executor 0): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
Can anyone give me some suggestions? Thanks in advance.