The Vector Tile Spark Process allows developers and data scientists to clip geographic data into Hadoop SequenceFiles on the Spark platform.
The rendering shown above comes from the Preview Example.
$ mvn clean && mvn package
$SPARK_HOME/bin/spark-submit \
    --class org.ieee.codemeow.geometric.spark.VectorTileTask \
    --master yarn \
    --deploy-mode cluster \
    --jars /path/to/postgresql-42.0.0.jar \
    --driver-class-path /path/to/postgresql-42.0.0.jar \
    /path/to/vectortile-spark-process-1.0-SNAPSHOT.jar \
    hdfs:///path/to/vectortile-spark-process.yml
---
# vectortile-spark-process.yml
appName: "Vector Tile Process"
sequenceFileDir: "hdfs:///path/to"
layers:
  - layerName: "layerName"
    minZoom: "0"
    maxZoom: "22"
    dataProvider: "org.ieee.codemeow.geometric.spark.data.SQLDataProvider"
    kwargs:
      url: "jdbc:postgresql://hostname/dbname"
      dbtables:
        planet_osm_line: "public.planet_osm_line"
        planet_osm_point: "public.planet_osm_point"
        planet_osm_polygon: "public.planet_osm_polygon"
        planet_osm_roads: "public.planet_osm_roads"
      user: "postgres"
      password: "postgres"
      zooms:
        0: "SELECT osm_id AS __id__, ST_GeomFromWKB(way) AS __geometry__ FROM ..."
        1: "SELECT osm_id AS __id__, ST_GeomFromWKB(way) AS __geometry__ FROM ..."
        ...
        22: "SELECT osm_id AS __id__, ST_GeomFromWKB(way) AS __geometry__ FROM ..."
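As the template above shows, every zoom-level query must alias a feature identifier to `__id__` and a geometry to `__geometry__`. A filled-in sketch for a single zoom level might look like the following; the `planet_osm_roads` table comes from the `dbtables` mapping above, but the `highway` column and its filter value are illustrative assumptions about an osm2pgsql-style schema, not part of this project's documented contract:

```yaml
zooms:
  # Hypothetical example: assumes planet_osm_roads (registered under
  # "dbtables") has an osm2pgsql-style "highway" column.
  5: "SELECT osm_id AS __id__, ST_GeomFromWKB(way) AS __geometry__ FROM planet_osm_roads WHERE highway = 'motorway'"
```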
Find a bug or want to request a new feature? Please let us know by submitting an issue.
Upgrade the protobuf package version on your Spark cluster:
cp protobuf-java-3.0.0-beta-2.jar $SPARK_HOME/jars
Use Spark SQL in the `zooms` section of the configuration file.
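Because each `zooms` entry is an ordinary query, you can tailor the selection per zoom level, for example by filtering coarse zooms down to major features and letting detailed zooms select everything. A hedged sketch, where the table names come from the `dbtables` mapping above but the `highway` column and filter values are assumptions about an osm2pgsql-style schema:

```yaml
zooms:
  # Coarse zoom: keep only major roads (illustrative filter).
  6: "SELECT osm_id AS __id__, ST_GeomFromWKB(way) AS __geometry__ FROM planet_osm_roads WHERE highway IN ('motorway', 'trunk')"
  # Detailed zoom: include all line features.
  14: "SELECT osm_id AS __id__, ST_GeomFromWKB(way) AS __geometry__ FROM planet_osm_line"
```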