datahq / dataflows

DataFlows is a simple, intuitive, lightweight framework for building data processing flows in Python.
https://dataflows.org
MIT License

Supporting GeoJSON #152

Open gperonato opened 3 years ago

gperonato commented 3 years ago

Is GeoJSON support on the roadmap? Dumping to GeoJSON would be a nice feature to have for datasets containing geographic features.

gperonato commented 3 years ago

First attempt in #153

akariv commented 3 years ago

Great suggestion - let's work on that PR together.

n0rdlicht commented 3 years ago

As far as I can tell from the PR, this helps with writing point-based GeoJSON files. Have you thought about also supporting MultiPoint, LineString, Polygon, MultiLineString, and MultiPolygon geometries? The spec currently doesn't account for these, except by wrapping each one in a GeoJSON object per row.

gperonato commented 3 years ago

Hi @n0rdlicht, indeed, you would need a geojson object per row to include geometries other than geopoints. The current spec is based on the Frictionless Table Schema, which only supports the geopoint and geojson types. I think this is good for compatibility with CSV files, where you usually do not store complex geometries (or, if you do, I think it's good practice to include a geojson object). But it would indeed be interesting to accept generic GeoJSON files (which would then include the geometry types you mentioned) as source files in the pipeline.
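
To illustrate the "one geometry per row" idea, here is a minimal sketch in plain Python. It is not the code from #153 and does not use any dataflows API. The function names, the `geom` field name, and the sample rows are all hypothetical. It assumes a geopoint value arrives in the Table Schema default "lon, lat" string format, and that a geojson value is already a geometry object (a dict).

```python
import json


def row_to_feature(row, geometry_field):
    """Turn one table row (a dict) into a GeoJSON Feature.

    Hypothetical helper: assumes the geometry column holds either a
    "lon, lat" string (geopoint) or a geometry dict (geojson).
    """
    value = row[geometry_field]
    if isinstance(value, dict):
        # geojson field: the value is used as the geometry unchanged.
        geometry = value
    else:
        # geopoint field in "lon, lat" string form.
        lon, lat = (float(part) for part in value.split(","))
        geometry = {"type": "Point", "coordinates": [lon, lat]}
    # Every other column becomes a feature property.
    properties = {k: v for k, v in row.items() if k != geometry_field}
    return {"type": "Feature", "geometry": geometry, "properties": properties}


def rows_to_feature_collection(rows, geometry_field):
    """Wrap all rows in a single GeoJSON FeatureCollection."""
    features = [row_to_feature(row, geometry_field) for row in rows]
    return {"type": "FeatureCollection", "features": features}


# Made-up sample rows: one geopoint, one geojson LineString.
rows = [
    {"name": "station", "geom": "6.63, 46.52"},
    {
        "name": "track",
        "geom": {"type": "LineString", "coordinates": [[6.6, 46.5], [6.7, 46.6]]},
    },
]
collection = rows_to_feature_collection(rows, "geom")
print(json.dumps(collection, indent=2))
```

With this layout, point rows stay CSV-friendly, while any other geometry type travels as a geojson object in the same column.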

n0rdlicht commented 3 years ago

Fully agree with this. What are your thoughts on using WKT strings as a second option?
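
For context on what a WKT option might look like, here is a rough sketch that serializes a few GeoJSON geometry types to WKT strings using only plain Python. The helper names are hypothetical, nothing here is part of dataflows or the Frictionless spec, and it assumes 2D coordinates only.

```python
def _coords(points):
    """Format a list of [x, y] pairs as "x y, x y, ..." (2D only)."""
    return ", ".join(f"{x} {y}" for x, y in points)


def geometry_to_wkt(geometry):
    """Serialize a GeoJSON geometry dict to a WKT string.

    Hypothetical helper covering Point, LineString and Polygon only.
    """
    kind = geometry["type"]
    coords = geometry["coordinates"]
    if kind == "Point":
        return f"POINT ({coords[0]} {coords[1]})"
    if kind == "LineString":
        return f"LINESTRING ({_coords(coords)})"
    if kind == "Polygon":
        # A polygon is a list of rings; each ring is a list of positions.
        rings = ", ".join(f"({_coords(ring)})" for ring in coords)
        return f"POLYGON ({rings})"
    raise ValueError(f"Unsupported geometry type: {kind}")


point_wkt = geometry_to_wkt({"type": "Point", "coordinates": [6.63, 46.52]})
line_wkt = geometry_to_wkt({"type": "LineString", "coordinates": [[0, 0], [1, 1]]})
polygon_wkt = geometry_to_wkt(
    {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}
)
print(point_wkt)
print(line_wkt)
print(polygon_wkt)
```

A WKT value is a single string, so it would fit in a CSV cell without the quoting that an embedded JSON object needs.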

gperonato commented 3 years ago

Definitely a very good idea! But maybe this should first be discussed in the Frictionless repositories or Discord channel, and only then integrated into dataflows and/or datapackage-pipelines.