add GeoJSON data export option

getodk / central

ODK Central is a server that is easy to use, very fast, and stuffed with features that make data collection easier. Contribute and make the world a better place! ✨🗄✨

https://docs.getodk.org/central-intro/

Apache License 2.0

add GeoJSON data export option #27

Open danbjoseph opened 6 years ago

danbjoseph commented 6 years ago

consider implementing GeoJSON export as an option after this functionality is implemented in Aggregate and Briefcase?

see opendatakit/roadmap#26

issa-tseng commented 6 years ago

good news—the OData output is already a JSON format, and in OData geographic data values are outputted in conformance to GeoJSON!

lognaturel commented 6 years ago

@clint-tseng do you have a suggested process for using the OData feed in a context where a data cleaning step is needed? For example, does PowerBI or another tool make it easy to edit the data and export from there? Or maybe you'd recommend going through a JSON to CSV converter (since directly downloading the CSV provides geo data in ODK format rather than GeoJSON).

Are tools that don't speak OData specifically but have some kind of JSON ingestion likely to be able to do something with the OData output? For example, something like QGIS (https://webgeodatavore.com/add-geojson-content-in-qgis-short-recipes.html)? My understanding is that those will generally require a GeoJSON-only feed (as opposed to JSON with embedded GeoJSON).

@danbjoseph it would help to know more about what an ideal workflow might be for you and what tools you'd like to use for analysis to decide what, if any, action is needed.

issa-tseng commented 6 years ago

outside of a JSON container format i would not recommend employing GeoJSON as an atomic storage value. WKT is a much more fluent geography standard for inline text formats like CSV. in general, though, most data cleaning tools are very agnostic to the container format—and i'm not following why that question is coming up in this context.
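As a rough illustration of the WKT point above, here is a minimal sketch of turning a GeoJSON geometry into a WKT string that could sit in a single CSV cell. The function name `geojson_to_wkt` is hypothetical and not part of Central or any ODK tool. The sketch assumes coordinates are in GeoJSON order (longitude, latitude) and it drops any third (altitude) value.

```python
def _pos(p):
    # Use only longitude and latitude; a third (altitude) value is dropped.
    return f"{p[0]} {p[1]}"


def geojson_to_wkt(geom):
    """Convert a GeoJSON Point, LineString, or Polygon dict to 2D WKT.

    Hypothetical helper for illustration only.
    """
    kind, coords = geom["type"], geom["coordinates"]
    if kind == "Point":
        return f"POINT ({_pos(coords)})"
    if kind == "LineString":
        return "LINESTRING (" + ", ".join(_pos(p) for p in coords) + ")"
    if kind == "Polygon":
        rings = ", ".join(
            "(" + ", ".join(_pos(p) for p in ring) + ")" for ring in coords
        )
        return f"POLYGON ({rings})"
    raise ValueError(f"unsupported geometry type: {kind}")


# Made-up example value, not real submission data.
print(geojson_to_wkt({"type": "Point", "coordinates": [-122.4, 37.8, 10]}))
```

The result is one short string per geometry, which is easier to carry in a CSV column than a nested JSON object.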

GeoJSON is really wishy washy. the main issue with making our data "more geojsony" than it is now is not actually the container so much as it is the fact that we don't necessarily understand what data values to associate with which geography. what are we supposed to do, for instance, if there are ten flat fields in a form and three of them are geopoints? where do the remaining seven properties get assigned?

without a deeper semantic knowledge of the data going into the form, these questions can't be answered by either Aggregate or Central.
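To make the ambiguity concrete, here is a sketch of one arbitrary answer: emit one GeoJSON feature per geo field and copy every non-geo value onto each of them. This is not what Central or Aggregate does. The row shape, the field names, and the `geometry_source` property are all made up for illustration.

```python
GEOMETRY_TYPES = {
    "Point", "LineString", "Polygon",
    "MultiPoint", "MultiLineString", "MultiPolygon",
}


def is_geometry(value):
    # Treat any dict that looks like a GeoJSON geometry as a geo field.
    return (
        isinstance(value, dict)
        and value.get("type") in GEOMETRY_TYPES
        and "coordinates" in value
    )


def features_per_geofield(row):
    """Return one Feature per geo field, duplicating the flat properties."""
    geo = {k: v for k, v in row.items() if is_geometry(v)}
    flat = {k: v for k, v in row.items() if k not in geo}
    return [
        {
            "type": "Feature",
            "geometry": geometry,
            # Record which form field the geometry came from (hypothetical key).
            "properties": {**flat, "geometry_source": name},
        }
        for name, geometry in geo.items()
    ]


# Hypothetical submission with three geopoints and two flat fields.
row = {
    "household_id": "H-17",
    "water_source": "well",
    "home": {"type": "Point", "coordinates": [36.8, -1.3]},
    "well": {"type": "Point", "coordinates": [36.9, -1.2]},
    "clinic": {"type": "Point", "coordinates": [36.7, -1.4]},
}
for feature in features_per_geofield(row):
    print(feature["properties"]["geometry_source"], feature["geometry"]["coordinates"])
```

Every flat value ends up attached to all three points, whether or not it actually describes that location, which is exactly the semantic question the server cannot settle on its own.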

lognaturel commented 6 years ago

> in general, though, most data cleaning tools are very agnostic to the container format—and i'm not following why that question is coming up in this context.

I'm asking because one reason users may want geo-focused exports is that they don't expect to use the raw data in analysis, and in that case it's not clear what to do with the OData feed.

> what are we supposed to do, for instance, if there are ten flat fields in a form and three of them are geopoints? where do the remaining seven properties get assigned?

There's a broader conversation going on at https://forum.opendatakit.org/t/add-a-geojson-export-to-briefcase-and-aggregate/15184. @danbjoseph is more knowledgeable in this area and can likely provide more background. My understanding is that users find it useful to have just the geo data without any link to the form content.

issa-tseng commented 6 years ago

curious. well, in general i am reluctant to add five million export formats and formulations to support. i'd sooner provide e.g. kettle or other translation systems from OData to other desired formats.

issa-tseng commented 5 years ago

going through all Central issues and reading this again;

i am still interested in figuring out a solution here that preserves OData as the base export format. i understand the approach taken by Aggregate but it puts a lot of onus upon the eventual consumer to correlate the geographic information with the datapoints associated with each geographic feature. this is not really the norm in geospatial processing, as cartographic information is usually built upon these auxiliary pieces of data (eg color code points by category, size them by some data value, etc).

i am not trying to be and trying not to be the Central Czar, but i do try to advocate consistently for careful, sustainable solutions that benefit all users, regardless of technical data-processing ability. perhaps if we have a few voices in this thread that need to use geospatial tools with their Central-based data, but do not have the ability to do this already with the data we currently provide (which a lot of our more vocal users could do, albeit at an inconvenience), we can begin to work out what that solution might look like.

lognaturel commented 5 years ago

The answer could possibly be "go through Briefcase."

danbjoseph commented 5 years ago

Personally, I can use JSON or CSV just fine. And I agree that figuring out how to do GeoJSON in a convenient, consistently useful way is hard (surveys with more than one geopoint, surveys with a geopoint but not for all questions, etc).

florianm commented 3 years ago

Recent post on the forum: https://forum.getodk.org/t/geopoint-data-import-records-to-qgis-using-virtual-layers/34969

The GeoJSON export could accept a parameter to specify a field name for the geometry, or use the first geofield as default.
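A minimal sketch of that idea, working on already-downloaded OData-style rows rather than inside Central itself. The function `to_feature_collection`, its `geometry_field` parameter, and the sample rows are hypothetical. It assumes geo values arrive as embedded GeoJSON geometry objects, and it leaves any other geo fields in the feature's properties untouched.

```python
import json

GEOMETRY_TYPES = {
    "Point", "LineString", "Polygon",
    "MultiPoint", "MultiLineString", "MultiPolygon",
}


def is_geometry(value):
    # A dict with a known GeoJSON geometry type and coordinates.
    return (
        isinstance(value, dict)
        and value.get("type") in GEOMETRY_TYPES
        and "coordinates" in value
    )


def to_feature_collection(rows, geometry_field=None):
    """Build a GeoJSON FeatureCollection from a list of row dicts.

    geometry_field names the field to use as each feature's geometry.
    If it is None, the first geo field found in each row is used.
    """
    features = []
    for row in rows:
        field = geometry_field
        if field is None:
            field = next((k for k, v in row.items() if is_geometry(v)), None)
        value = row.get(field) if field is not None else None
        # Rows without a usable geometry become features with null geometry.
        geometry = value if is_geometry(value) else None
        properties = {k: v for k, v in row.items() if k != field}
        features.append(
            {"type": "Feature", "geometry": geometry, "properties": properties}
        )
    return {"type": "FeatureCollection", "features": features}


# Hypothetical rows; the second one has no geo data at all.
rows = [
    {
        "id": "uuid:1",
        "site": {"type": "Point", "coordinates": [115.8, -31.9]},
        "nest": {"type": "Point", "coordinates": [115.9, -32.0]},
    },
    {"id": "uuid:2"},
]
print(json.dumps(to_feature_collection(rows, geometry_field="nest"), indent=2))
```

Calling `to_feature_collection(rows)` without the parameter would fall back to `site`, the first geo field in the row. The printed JSON could be saved to a `.geojson` file for tools that expect a GeoJSON-only input.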

florianm commented 3 years ago

Not that R-literate users need much help in this direction, but the recently updated ruODK vignette "spatial" provides working examples for spatial data.

Talking points:

Which of possibly many geofields will become the feature's geometry? Answer: the user has to decide.
How to go from ODK Central WKT or GeoJSON output to spatially enabled data structures in R (e.g. objects of class sf) and maps? Answer: working examples are provided in the vignette.