rossjones opened this issue 9 years ago (Open)
I also note that I'm pretty sure the Natural History Museum already did most of this work for their deployment.
Guessing based on column headers is how Esri does it in koop. On it "just working": keep in mind that PostGIS can output geometry in a variety of ways, so just including a `geom` field in your `SELECT` query will give you binary geometry output. You have to wrap it in `ST_AsGeoJSON(geom)::json` to get GeoJSON output, which would probably make the most sense. If it helps, here's how I did it in my SODA API implementation: it checks whether any of the fields in the `SELECT` statement are of type `geometry` and then wraps them in the above format.
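To make that concrete, here is a rough sketch of the wrapping step (not the SODA implementation itself); the `types` mapping from column name to Postgres type is a hypothetical lookup you would populate from the catalog:

```python
def wrap_geometry_fields(fields, types):
    """Build a SELECT list that wraps geometry columns in
    ST_AsGeoJSON(...)::json so PostGIS returns GeoJSON rather than
    raw binary geometry. `types` maps column name -> Postgres type
    (a hypothetical lookup; in practice you'd query the catalog)."""
    parts = []
    for name in fields:
        if types.get(name) == "geometry":
            parts.append('ST_AsGeoJSON("%s")::json AS "%s"' % (name, name))
        else:
            parts.append('"%s"' % name)
    return "SELECT " + ", ".join(parts)

# e.g. wrap_geometry_fields(["id", "geom"], {"geom": "geometry"})
# -> SELECT "id", ST_AsGeoJSON("geom")::json AS "geom"
```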
Pinging this issue again, especially in light of the recent major datastore developments (semi-auto data dictionary; pgloader; download in alternate formats; performance improvements; triggers, etc.) and the CDO/CIO's Open Letter to the Open Data Community, where they requested that geospatial data be treated as a first-class data type.
We did a prototype implementation of async geocoding using a background queue: an extras metadata field captures the state of the async geocoding job (request geocoding, being geocoded, geocoded), and the geocoded file is then created as a separate resource.
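For illustration only, the state handling can be as simple as stepping through a fixed sequence stored in that extras field; the names below mirror the ones mentioned above, but this is a sketch, not the prototype's actual code:

```python
# Illustrative only: the prototype's actual extras field name and
# state values may differ from these.
GEOCODE_STATES = ("request geocoding", "being geocoded", "geocoded")

def advance_geocode_state(state):
    """Move the geocoding job's extras field to its next state;
    the final state ("geocoded") is terminal."""
    i = GEOCODE_STATES.index(state)
    return GEOCODE_STATES[min(i + 1, len(GEOCODE_STATES) - 1)]
```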
Perhaps the community can collaborate on a geo-enabled datastore after the latest batch of datastore enhancements. This would be another big installment in moving CKAN beyond "data catalog" workloads to "data-as-infrastructure" workloads. cc @amercader @wardi @davidread
https://github.com/NaturalHistoryMuseum/ckanext-dataspatial (geospatial searches; I think it assumes you already have your geodata in the tables)
https://github.com/derilinx/ckanext-vectorstorer / https://github.com/PublicaMundi/ckanext-vectorstorer (gets GeoJSON and Shapefiles in via GeoServer)
The big question is what the workflow for getting the data in looks like. CKAN's support for importing XLS/CSV etc. into a database table is good but not perfect. Really slick importing of many different geo formats is a big task. The vectorstorer extension does a good job, but it could use more work, and I'm not sure GeoServer is the best solution just for imports (we could call GDAL directly instead) unless you want the other benefits of GeoServer (tile previews, dynamic conversions to other formats).
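Calling GDAL directly could be as simple as shelling out to ogr2ogr. The helper below only builds the command line; the table name and libpq connection string are placeholders, and you would still need validation and error handling around the actual call:

```python
def ogr2ogr_to_postgis(src_path, table, pg_dsn, srid="4326"):
    """Build an ogr2ogr command that loads a vector file straight into
    a PostGIS table, with no GeoServer involved. pg_dsn is a
    libpq-style string such as "dbname=datastore user=ckan"
    (placeholder values)."""
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",        # output driver
        "PG:" + pg_dsn,            # destination datastore database
        src_path,                  # anything GDAL reads: GeoJSON, SHP, GPKG...
        "-nln", table,             # target table name
        "-t_srs", "EPSG:" + srid,  # reproject to a common SRID
        "-overwrite",
    ]

# Then e.g.:
# subprocess.check_call(
#     ogr2ogr_to_postgis("parcels.geojson", "parcels", "dbname=datastore"))
```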
What about using the-el? It's a command-line tool for extracting and loading SQL tables using a JSON Table Schema. It wraps a fork of a Frictionless Data tool to add geospatial, Carto, and Oracle support, so it could alternatively be used as a Python library (or at least the underlying tool it forks). /cc @awm33
Suggested via https://twitter.com/timwis/status/631068707955961861
"Do you have any plans to add geospatial queries to DataStore API? I don't see it on the roadmap and surprised no one's asked"
Datastore

A possible starting point (and an assumption) is that if the datastore database is PostGIS, then `datastore_search_sql` will 'just work'. The other action methods would probably need further work (to ensure insertions etc.).
DataPusher

The real problem is likely the type-guessing in datapusher: it needs to be able to determine that the float it has just found is a latitude, or that the string it has is GeoJSON (for example). Perhaps, as a first pass, this could be done based on a convention for column headers? It's a bit fragile, but it would simplify a first version.
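A convention-based first pass might look something like this; the header name lists are purely illustrative, not a proposal for the actual convention, and real data would defeat them often enough that users should be able to override the guess:

```python
def guess_geo_columns(headers):
    """First-pass, convention-based guess at geospatial columns from
    CSV headers. The name sets below are illustrative; any real
    convention would need to be agreed and will always be fragile
    (e.g. a column called "long" might be anything)."""
    lat_names = {"lat", "latitude", "y"}
    lon_names = {"lon", "lng", "long", "longitude", "x"}
    geo_names = {"geojson", "geometry", "geom", "the_geom", "shape"}
    guess = {}
    for h in headers:
        key = h.strip().lower()
        if key in lat_names:
            guess["latitude"] = h
        elif key in lon_names:
            guess["longitude"] = h
        elif key in geo_names:
            guess["geometry"] = h
    return guess
```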
Possibly related to #151