geotrellis / vectorpipe

Convert Vector data to VectorTiles with GeoTrellis.
https://geotrellis.github.io/vectorpipe/

allGeoms and constructGeometries Don't Produce the Correct Schema #55

Open jbouffard opened 6 years ago

jbouffard commented 6 years ago

The OSMReader.allGeoms value is a DataFrame that is supposed to contain all of the Geometries from the source file. However, the geom column in its schema is of type Point instead of type Geometry:

scala> reader.allGeoms.schema.printTreeString
root
 |-- _type: byte (nullable = false)
 |-- id: long (nullable = true)
 |-- geom: point (nullable = true)
 |-- tags: map (nullable = true)
 |    |-- key: string
 |    |-- value: string (valueContainsNull = true)
 |-- changeset: long (nullable = true)
 |-- updated: timestamp (nullable = true)
 |-- validUntil: timestamp (nullable = true)
 |-- visible: boolean (nullable = true)
 |-- version: integer (nullable = true)
 |-- minorVersion: integer (nullable = true)

Because OSMReader.allGeoms uses the same logic as ProcessOSM.constructGeometries, the latter method has the same issue.
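
Since both entry points are affected, a small schema check may help confirm the problem, and later a fix, in either place. The following is only a sketch: columnTypeName and hasColumnType are hypothetical helpers that are not part of vectorpipe, and they rely solely on Spark's StructType API. The type name they compare is the same short name that printTreeString shows in the schema above (for example "point").

```scala
import org.apache.spark.sql.types.StructType

// Hypothetical helper (not part of vectorpipe): returns the short type name
// Spark reports for a column, i.e. the name shown by printTreeString.
def columnTypeName(schema: StructType, column: String): String =
  schema(column).dataType.typeName

// Hypothetical check: true only when the column exists and its reported
// type name equals `expected`.
def hasColumnType(schema: StructType, column: String, expected: String): Boolean =
  schema.fieldNames.contains(column) && columnTypeName(schema, column) == expected
```

Assuming the generic geometry type reports the name "geometry" (an assumption, not verified against the Spark JTS sources), hasColumnType(reader.allGeoms.schema, "geom", "geometry") would be expected to return false while this issue is open.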

This problem is most likely the result of the union that occurs, and one possible solution would be to reverse it. However, it's not clear what performance impact, if any, that would have.

mojodna commented 6 years ago

Casting feels like the right solution here, but it doesn't look like it's possible to cast to a less specific type given the functions available. Maybe forget the Spark JTS casting and use Spark to cast to GeometryUDT in https://github.com/geotrellis/vectorpipe/pull/54/files#diff-f2b29a9e7b5c57acf2ea5f60fa9824eaR254 and related spots.