Hello everyone,
I used Spark to develop an example of a geospatial query, but I'm running into an issue.
1) To meet the needs of Spark processing, I modified the nyc-boroughs.geojson file so that each feature takes one line. The content of the file looks like this:
{ "type": "", "id": , "properties": { }, "geometry": { "type": "", "coordinates": [ ] } }
{ "type": "", "id": , "properties": { }, "geometry": { "type": "", "coordinates": [ ] } }
…………
{ "type": "", "id": , "properties": { }, "geometry": { "type": "", "coordinates": [ ] } }
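For example, a single newline-delimited feature line could look like the following (hypothetical, illustrative values only, not copied from the real file):

```json
{ "type": "Feature", "id": 0, "properties": { "borough": "Manhattan" }, "geometry": { "type": "Polygon", "coordinates": [ [ [-74.01, 40.70], [-73.97, 40.70], [-73.97, 40.75], [-74.01, 40.75], [-74.01, 40.70] ] ] } }
```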
2) I used the GeoJson.scala and RichGeometry.scala from "ch08-geotime"; my code looks like this:
import org.apache.spark.{SparkConf, SparkContext}
import com.esri.core.geometry.Point
import spray.json._
// Feature and GeoJsonProtocol come from GeoJson.scala in ch08-geotime
import com.cloudera.datascience.geotime._
import com.cloudera.datascience.geotime.GeoJsonProtocol._

val conf = new SparkConf().setAppName("geometry api test").setMaster("spark://master:7077")
val sc = new SparkContext(conf)
val input = sc.textFile("hdfs://master:8020/data/jsondata/nyc-boroughs.geojson")
val features = input.map(line => line.parseJson.convertTo[Feature])
val resultsRDD = features.filter(feature =>
  feature.geometry.contains(new Point(-74.0059731, 40.7143528)))
resultsRDD.collect().foreach { feature =>
  println(feature.properties)
  println(feature.geometry.geometry.getType)
}
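For context, the `contains` call above is the Esri Geometry API's point-in-polygon test (exposed through RichGeometry). As a rough, stdlib-only illustration of the underlying idea (ray casting over a hypothetical square polygon, ignoring holes and spatial references, and not the library's actual implementation):

```scala
// Simplified ray-casting point-in-polygon test (illustrative only --
// the Esri Geometry API's contains handles holes, precision, etc.).
def contains(polygon: Seq[(Double, Double)], x: Double, y: Double): Boolean = {
  var inside = false
  var j = polygon.length - 1
  for (i <- polygon.indices) {
    val (xi, yi) = polygon(i)
    val (xj, yj) = polygon(j)
    // Does edge (i, j) straddle the horizontal line through y,
    // and does the ray to the left of (x, y) cross it?
    val crosses = (yi > y) != (yj > y)
    if (crosses && x < (xj - xi) * (y - yi) / (yj - yi) + xi) inside = !inside
    j = i
  }
  inside
}

// A square loosely around lower Manhattan (hypothetical coordinates).
val square = Seq((-75.0, 40.0), (-73.0, 40.0), (-73.0, 42.0), (-75.0, 42.0))
println(contains(square, -74.0059731, 40.7143528)) // true
```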
3) Everything is fine in the steps above, but when I run the final action (the resultsRDD.collect() call), I get an error like this:
WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, 172.16.50.75, executor 0): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
Can anyone give me some suggestions? Thanks in advance.