ericsun95 / spark-osm-pbf

OSM PBF Data Source for Apache Spark
Apache License 2.0

Explore DataSourceV2 API #3

Open ericsun95 opened 3 years ago

ericsun95 commented 3 years ago

Explore DataSourceV2 API and come up with a plan.

ericsun95 commented 3 years ago

This doc describes DataSourceV2 in general. Implementations should use at least the ReadSupport or WriteSupport interface for readable or writable data sources, respectively. https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-DataSourceV2.html

ericsun95 commented 3 years ago

ReadSupport defines a createReader method that creates a DataSourceReader; a second overload also takes a user-specified schema:

DataSourceReader createReader(DataSourceOptions options)
DataSourceReader createReader(StructType schema, DataSourceOptions options)
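
To make the relationship between the two overloads concrete, here is a minimal, dependency-free sketch. It does not use Spark: `Reader`, `ReadSupport`, and `OsmSource` are hypothetical local stand-ins, with `Map<String, String>` in place of DataSourceOptions and `List<String>` in place of StructType. The column names and the `path` option are illustrative assumptions, as is the choice to have the schema overload reject user-specified schemas by default.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Dependency-free sketch: the nested types are local stand-ins, NOT Spark classes.
class ReadSupportSketch {

    // Stand-in for DataSourceReader: only exposes the schema it will read with.
    interface Reader {
        List<String> readSchema();
    }

    // Stand-in for ReadSupport, mirroring the two createReader overloads.
    // Map<String, String> stands in for DataSourceOptions, List<String> for StructType.
    interface ReadSupport {
        Reader createReader(Map<String, String> options);

        // Assumption: a source that cannot honor a user-specified schema rejects it.
        default Reader createReader(List<String> schema, Map<String, String> options) {
            throw new UnsupportedOperationException("user-specified schema not supported");
        }
    }

    // Hypothetical OSM source; the column names are illustrative only.
    static class OsmSource implements ReadSupport {
        static final List<String> DEFAULT_SCHEMA =
                Collections.unmodifiableList(Arrays.asList("id", "type", "tags"));

        // No schema given: fall back to the source's own default schema.
        @Override
        public Reader createReader(Map<String, String> options) {
            return createReader(DEFAULT_SCHEMA, options);
        }

        // Schema given: validate the options, then read with the caller's schema.
        @Override
        public Reader createReader(List<String> schema, Map<String, String> options) {
            if (!options.containsKey("path")) {
                throw new IllegalArgumentException("missing 'path' option");
            }
            return () -> schema;
        }
    }

    public static void main(String[] args) {
        Map<String, String> options =
                Collections.singletonMap("path", "/tmp/sample.osm.pbf");
        ReadSupport source = new OsmSource();
        System.out.println(source.createReader(options).readSchema());
        System.out.println(source.createReader(Arrays.asList("id"), options).readSchema());
    }
}
```

Routing the single-argument overload through the two-argument one keeps option validation in one place. A real implementation would return a reader that also plans partitions, which this sketch leaves out.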

ericsun95 commented 3 years ago

Added initial support for DataSourceV2. Further related optimizations will follow.