hotosm / osm-fieldwork

Processing field data from ODK to OpenStreetMap format, and other field data collection utils.
GNU Affero General Public License v3.0

Use the Underpass raw OSM database #79

Closed robsavoye closed 1 year ago

robsavoye commented 1 year ago

Currently data extracts expect to use a local postgres database with all the data in it. This does not scale. Since HOT maintains a raw OSM database that is updated every minute and contains the entire planet, data extracts should come from that database instead. The raw database supports a REST API that uses a YAML config file to create the SQL query. Currently the make_data_extract.py program contains the SQL queries itself, so this change makes it query our remote database instead of a local one.
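To illustrate the idea of generating the SQL from a config rather than hard-coding it, here is a minimal sketch. It is not the Underpass API or the code in make_data_extract.py. The table name `ways_poly`, the `tags` and `geom` columns, the `where` config layout, and the `build_query` helper are all assumptions made up for this example, and the config is shown as an already-parsed dict rather than a YAML file.

```python
# Hypothetical table name; the real raw database schema may differ.
TABLE = "ways_poly"


def build_query(config, bbox):
    """Build a parameterized SQL query from a parsed config.

    config: dict with an optional "where" mapping of tag key -> list of
            accepted values (an empty list means "any value").
    bbox:   (min_lon, min_lat, max_lon, max_lat) in EPSG:4326.
    Returns (sql, params), suitable for a DB-API cursor.execute() call.
    """
    clauses = []
    params = list(bbox)
    for key, values in config.get("where", {}).items():
        if values:
            # Match only the listed values for this tag key.
            marks = ", ".join(["%s"] * len(values))
            clauses.append(f"tags->>%s IN ({marks})")
            params.append(key)
            params.extend(values)
        else:
            # Any value is fine as long as the tag is present.
            clauses.append("tags->>%s IS NOT NULL")
            params.append(key)
    sql = (
        f"SELECT osm_id, tags, ST_AsGeoJSON(geom) FROM {TABLE} "
        "WHERE geom && ST_MakeEnvelope(%s, %s, %s, %s, 4326)"
    )
    if clauses:
        sql += " AND (" + " OR ".join(clauses) + ")"
    return sql, params
```

Tag keys and values are passed as query parameters rather than formatted into the string, so a config file cannot inject SQL.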

robsavoye commented 1 year ago

This turns out to be a big mess... When using select_from_file, any tag or value that isn't in the choices sheet prevents ODK Collect from launching. Up till now the data extracts have been easy, as the only tags were building=yes. I did a test yesterday in the Thamel area of Kathmandu, and many of the hotels have more than a dozen tags! Often they are unofficial ones that don't really exist in OSM. While it would be nice to have ODK Collect ignore the extraneous tags, I think for now we'll need a filter program to clean up the tags & values so ODK Collect will launch.

robsavoye commented 1 year ago

I now have a program that extracts all the tags & values in the HOT data models document, and then uses that to filter the data extract and remove weird tags and values so ODK Collect can load the data extract. Since the XLSForm library implements the HOT data models, this should keep them in sync.
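A rough sketch of the filtering step, for anyone following along. This is not the actual program: the `filter_tags` name, the feature layout (a dict with a `tags` mapping), and the shape of the `allowed` mapping are assumptions for illustration. In practice `allowed` would be built from the tags & values in the data models.

```python
def filter_tags(features, allowed):
    """Drop tags and values that are not in the allowed mapping.

    features: list of dicts, each with a "tags" dict of key -> value.
    allowed:  dict of tag key -> set of accepted values.
    Returns a new list; the input features are not modified.
    """
    cleaned = []
    for feature in features:
        kept = {}
        for key, value in feature.get("tags", {}).items():
            # Skip tags whose key is not in the data models.
            if key not in allowed:
                continue
            # Skip values that are not listed for this key.
            if value not in allowed[key]:
                continue
            kept[key] = value
        # Copy the feature so the caller's data is left untouched.
        cleaned.append({**feature, "tags": kept})
    return cleaned
```

Features that end up with no tags are kept with an empty `tags` dict in this sketch, so the geometry is still available. Whether to drop them instead is a separate decision.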

robsavoye commented 1 year ago

This is now in main, so it is also accessible from FMTM.