simplexspatial / osm4scala

Scala and Spark library focused on reading OpenStreetMap Pbf files.
https://simplexspatial.github.io/osm4scala/
MIT License

Spark DataSets using OSMEntity case classes #75

Open angelcervera opened 3 years ago

angelcervera commented 3 years ago

At the moment, the Spark connector only exposes the data as DataFrames. It would be useful to support typed DataSets as well, so that OSMEntity types can be used directly with the Spark connector.

Something like the following cases should work.

      import spark.implicits._
      // Desired usage: build a DataSet directly from a mixed Seq of nodes and ways.
      val dataset = Seq(
        NodeEntity(1, 11, 10, Map("nodeId" -> "1")),
        NodeEntity(2, 12, 20, Map("nodeId" -> "2")),
        NodeEntity(3, 13, 30, Map.empty),
        NodeEntity(4, 14, 40, Map.empty),
        NodeEntity(5, 15, 50, Map.empty),
        NodeEntity(6, 16, 60, Map.empty),
        WayEntity(7, Seq(1, 2, 3, 4), Map("wayId" -> "7")),
        WayEntity(8, Seq(4, 5, 6), Map("wayId" -> "8"))
      ).toDS()

      dataset.show()

Or

      import spark.implicits._
      val monaco = spark.read
        .format("osm.pbf")
        .load("src/test/resources/monaco.osm.pbf")
        .persist()

      monaco.as[OSMEntity]
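
For context on why this is not trivial: Spark's built-in implicits derive encoders for case classes, but not for a sealed trait such as OSMEntity. One possible direction, sketched below without any Spark dependency, is to map every entity to a single flat case class and back. Everything in this sketch is an assumption made for illustration: `OsmRow`, its field names, and the `kind` discriminator are hypothetical, and the model classes are simplified stand-ins for the real osm4scala ones, not the library's actual API or DataFrame schema.

```scala
// Simplified, hypothetical stand-ins for the osm4scala model classes.
sealed trait OSMEntity {
  def id: Long
  def tags: Map[String, String]
}
final case class NodeEntity(id: Long, latitude: Double, longitude: Double, tags: Map[String, String])
    extends OSMEntity
final case class WayEntity(id: Long, nodes: Seq[Long], tags: Map[String, String]) extends OSMEntity

// Hypothetical flat representation: one plain case class that covers both kinds of entity.
final case class OsmRow(
    id: Long,
    kind: String, // "node" or "way" (assumed discriminator)
    latitude: Option[Double],
    longitude: Option[Double],
    nodes: Seq[Long],
    tags: Map[String, String]
)

object OsmRow {
  // Flatten an entity into a row; fields that do not apply are left empty.
  def fromEntity(entity: OSMEntity): OsmRow = entity match {
    case NodeEntity(id, lat, lon, tags) => OsmRow(id, "node", Some(lat), Some(lon), Seq.empty, tags)
    case WayEntity(id, nodes, tags)     => OsmRow(id, "way", None, None, nodes, tags)
  }

  // Rebuild the entity; returns None when the row is inconsistent (e.g. a node without coordinates).
  def toEntity(row: OsmRow): Option[OSMEntity] = row match {
    case OsmRow(id, "node", Some(lat), Some(lon), _, tags) => Some(NodeEntity(id, lat, lon, tags))
    case OsmRow(id, "way", _, _, nodes, tags)              => Some(WayEntity(id, nodes, tags))
    case _                                                 => None
  }
}

object Demo extends App {
  val entities: Seq[OSMEntity] = Seq(
    NodeEntity(1, 11, 10, Map("nodeId" -> "1")),
    NodeEntity(3, 13, 30, Map.empty),
    WayEntity(7, Seq(1, 2, 3, 4), Map("wayId" -> "7"))
  )

  // Round trip: entity -> flat row -> entity.
  val rows      = entities.map(OsmRow.fromEntity)
  val roundTrip = rows.flatMap(OsmRow.toEntity)
  assert(roundTrip == entities)
}
```

Because `OsmRow` is a plain case class with only `Option`, `Seq` and `Map` fields, something like `entities.map(OsmRow.fromEntity).toDS()` would be expected to work with `spark.implicits._` in scope. Another option is `Encoders.kryo[OSMEntity]`, which stores each entity as a single binary column at the cost of losing columnar access to its fields.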