healthysustainablecities / global-indicators

An open-source tool for calculating spatial indicators for healthy, sustainable cities worldwide using open or custom data.
MIT License

Should be able to construct pedestrian network from file without using Overpass #158

Open carlhiggs opened 1 year ago

carlhiggs commented 1 year ago

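Currently, we use OSMnx to derive a routable network with the graph_from_polygon function, which retrieves the network via the Overpass API. This has several disadvantages, notably the requirement of an internet connection and a temporal mismatch between the network retrieved at the time of analysis and the target time point.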

The requirement of an internet connection is a minor inconvenience. The temporal mismatch, however, creates issues for reproducibility: any modification to the network in or near populated urban areas will result in new sampling points and different accessibility estimates for them, and these differences cascade into the aggregate results. The differences might be small, particularly if the target time point and the time of analysis are not too far apart.

However, ideally we would derive edges, nodes and clean intersections directly from the OpenStreetMap archive file, which would address the above disadvantages. When we get time, it would be good to figure out how to do this. It wasn't done in the first instance because it wasn't obvious how to implement a custom network definition when using the graph from file function. However, it may be that this is now more feasible, or that if we looked at the problem again we'd find a solution.

carlhiggs commented 1 year ago

A promising approach to this may be to use PyrOSM which is designed for constructing networks for cities using pbf files, has a similar custom filter syntax, and is developed by Henrikki Tenkanen who is a colleague of @VuokkoH . We could still use OSMnx for intersection cleaning (and there's more to explore there, e.g. reconstructing the graph after simplification). Just wanted to record this thought here for something to think about after we've sorted the more pressing changes, as this will make things conceptually neater, drawing the network from an offline copy of OpenStreetMap.

A custom network can apparently be retrieved after loading a pbf file, and according to the reference manual the load can be restricted to a bounding box, which we could construct around a buffered study region. So it seems doable, I think?
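A minimal sketch of that idea, assuming pyrosm is installed and a local .pbf extract is available. The function names, the default padding, and the use of a crude padding in degrees (rather than a proper metric buffer of the study region) are illustrative assumptions, not part of the current workflow. It uses pyrosm's built-in "walking" network type; a fully custom pedestrian definition would need further work.

```python
def padded_bbox(bounds, pad_degrees):
    """Expand (minx, miny, maxx, maxy) WGS84 bounds by a fixed padding in degrees.

    Assumption: a crude stand-in for buffering the study region in metres.
    """
    minx, miny, maxx, maxy = bounds
    return [minx - pad_degrees, miny - pad_degrees,
            maxx + pad_degrees, maxy + pad_degrees]


def walking_graph_from_pbf(pbf_path, bounds, pad_degrees=0.02):
    """Build a walking network graph from a local .pbf file, without Overpass.

    pbf_path and bounds are hypothetical inputs supplied by the caller.
    """
    from pyrosm import OSM  # imported lazily so the helper above works without pyrosm

    # Restrict parsing to the padded bounding box around the study region.
    osm = OSM(pbf_path, bounding_box=padded_bbox(bounds, pad_degrees))
    # Retrieve nodes and edges for pyrosm's built-in "walking" network type.
    nodes, edges = osm.get_network(network_type="walking", nodes=True)
    # Export as a NetworkX graph, which OSMnx could then be used to clean.
    return osm.to_graph(nodes, edges, graph_type="networkx")
```

The resulting graph could then be passed to OSMnx for intersection cleaning, as suggested above.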

shiqin-liu commented 1 year ago

Per Geoff's notes, OSMnx has features for retrieving OSM data for a specific date; see https://github.com/gboeing/osmnx/issues/384

carlhiggs commented 1 year ago

Excellent, thanks Shirley. It could be good to look into implementing a way to customise the date of retrieval. I followed your link and see that this now looks to be done using the OSMnx Overpass settings: https://osmnx.readthedocs.io/en/stable/osmnx.html#module-osmnx.settings

overpass_settings : string
Settings string for Overpass queries. Default is "[out:json][timeout:{timeout}]{maxsize}". By default, the {timeout} and {maxsize} values are set dynamically by OSMnx when used. To query, for example, historical OSM data as of a certain date: '[out:json][timeout:90][date:"2019-10-28T19:20:00Z"]'. Use with caution.

It might not be trivial to get users to input a correct date that works. It would be worth testing how loose that format can be (e.g. if we asked users for just yyyy-mm-dd, could that be formatted in a way that would still work?). I suspect the 'use with caution' warning partly relates to that. Still, it could be a good option as a quick fix for the temporal issue while we look to implement the other options we discussed (loading from PBF, or loading a non-OSM network dataset, e.g. official roads where preferred).
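One way the yyyy-mm-dd idea could work is to validate the user's input and expand it to the full timestamp form shown in the documentation example. The sketch below uses only the standard library. The function name and the choice of midnight UTC as the time of day are assumptions, and whether Overpass behaves as intended for a given date would still need testing.

```python
from datetime import date, datetime, timezone


def overpass_settings_for_date(target_date, timeout=90):
    """Return an Overpass settings string for a user-supplied ISO date (yyyy-mm-dd).

    Raises ValueError if target_date is not a valid ISO date.
    Assumption: the snapshot is taken at midnight UTC on the given day.
    """
    parsed = date.fromisoformat(target_date)
    stamp = datetime(parsed.year, parsed.month, parsed.day, tzinfo=timezone.utc)
    timestamp = stamp.strftime("%Y-%m-%dT%H:%M:%SZ")
    return f'[out:json][timeout:{timeout}][date:"{timestamp}"]'


# Possible usage with OSMnx (not run here):
#   import osmnx as ox
#   ox.settings.overpass_settings = overpass_settings_for_date("2019-10-28")
```

Rejecting anything that is not a valid ISO date up front would avoid sending a malformed query to Overpass.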

carlhiggs commented 1 year ago

We previously implemented retrieval of the network to match the downloaded OSM file data, but I was reminded of this overall issue just now, as the Overpass server is down and returning a 504 error:

2023-04-06 09:53:23 Unable to query https://overpass-api.de/api/status, got status 504
2023-04-06 09:53:23 Pausing 60 seconds before making HTTP POST request

Ideally we would load the network from a file that has already been downloaded, and so be more resilient to this.
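A rough sketch of that kind of resilience: prefer a local copy of the network when one exists, and only fall back to an Overpass query otherwise. The function and parameter names are hypothetical, and the two loaders are passed in as callables because the actual readers are still to be decided.

```python
from pathlib import Path


def load_network(local_path, read_local, download):
    """Load a network from a local file if present, else fall back to downloading.

    read_local: callable taking a Path and returning a network (hypothetical;
                for an .osm XML file this might wrap osmnx.graph_from_xml).
    download:   callable taking no arguments that queries Overpass (hypothetical).
    """
    path = Path(local_path)
    if path.is_file():
        # Offline route: no dependency on the Overpass server being available.
        return read_local(path)
    # Online fallback: only reached when no local copy has been saved.
    return download()
```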

carlhiggs commented 1 year ago

In the updates above, I included PyrOSM and Momepy in the latest build of our Docker software environment (v4.4.0) in order to support us developing some options for users to load networks from file (e.g. from a .pbf file via PyrOSM, or from a geodataframe via Momepy).

In principle either approach should now work, with the network stored in PostGIS and retrieved as a geodataframe that can be processed in our neighbourhood analysis workflow, assuming the appropriate attributes are present.

In practice, I had a quick look at PyrOSM and ran into some difficulties with our current software stack. As per issue https://github.com/HTenkanen/pyrosm/issues/210 , it seems that the latest release (v0.6.1) requires Numpy < 1.25.0, but 1.25.0 is the version we are using. I am reluctant to downgrade Numpy because of the potential for other issues, and because (as per the linked issue) the main branch of PyrOSM apparently already allows Numpy v1.25.0; it just hasn't been made a formal release yet.

Perhaps @VuokkoH might know if it's realistic to expect a further update from her colleague for PyrOSM? It might be best to wait until then before drafting a function to use PyrOSM for loading networks with custom definitions for walking from the source .pbf file.

I also included Momepy, as in principle it can also be used to load graphs from geodataframes, and some of its urban morphology indicators could be of interest to some users.

So, both these software packages are now in the GHSCI software stack, so that others can use them or develop functionality around them should they need to.

VuokkoH commented 1 year ago

Hi! Quick comment: PyrOSM is great; I've used it with Finland-wide pre-downloaded PBF files for regional accessibility analyses before.

I'm sure Henrikki is willing to look into the issue (time allowing, of course) and in any case excited to hear of this use case. So, I'd recommend that you @carlhiggs post an issue about this to the PyrOSM repo (if you haven't already done so). I can also reach out to him via other channels. I guess it should not be a huge amount of work to do this upgrade if it is already implemented but not yet made into a release?

carlhiggs commented 1 year ago

Thanks @VuokkoH , another user has posted an issue about this and asked Henrikki ( https://github.com/HTenkanen/pyrosm/issues/210#issuecomment-1574939605 ) so figured he must be busy with other things and maybe missed the GitHub notification.

If you wouldn't mind scoping out if he is planning an updated release in the near future that would be awesome - no worries if not, just will be good to know his plans. Thanks!

VuokkoH commented 1 year ago

ah, indeed. He's now back from holidays and pyrOSM is in the list of things to do in July, so I heard :)

carlhiggs commented 1 year ago

In the Cycling indicators working group I recently shared a link to a newly published article on a tool called BikeDNA, which provides interesting functionality for network verification. Although it is specifically focused on cycling, there seem to be a number of useful functions. One in particular is the create_osmnx_graph function, which can take a linestring geodataframe and return a graph representation of it that is compatible with OSMnx. The result is not identical as far as I can tell (I fed edges derived from OSMnx into it to compare, and the length values in metres differed by a few metres or so, less than 10%), but it was pretty close. It seems to use momepy for some of the conversion.

Anyway, for now I have constructed v4.4.8 of the Docker image with BikeDNA_BIG pre-installed (this also contains pyrOSM).

Until I write helper functions and a configuration mechanic to optionally point to custom data, this should allow those who want to experiment with externally defined networks (including using more advanced OSMnx operations, and/or for cycling networks) to do so.

Vierø, A. R., Vybornova, A., & Szell, M. (2023). BikeDNA: A tool for bicycle infrastructure data and network assessment. Environment and Planning B: Urban Analytics and City Science, 0(0). https://doi.org/10.1177/23998083231184471

There are two associated repositories; the second is optimised for large-scale analysis:
https://github.com/anerv/BikeDNA
https://github.com/anerv/BikeDNA_BIG

To bring in the create_osmnx_graph function, you can run

from src.graph_functions import create_osmnx_graph

and then pass it a gdf. The output could then be converted back to gdfs in the OSMnx-readable format and written to PostGIS for use in analyses.

So, using that approach should yield capacity for loading custom networks.
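As a rough illustration of the general idea (not BikeDNA's implementation), the sketch below builds a graph from a linestring geodataframe using momepy's gdf_to_nx, and adds a small helper for the kind of length comparison described above. The function names are hypothetical, the geodataframe is assumed to be in a projected CRS with metre units, and the resulting graph is not guaranteed to be directly OSMnx-compatible.

```python
def lines_to_graph(lines_gdf):
    """Build a NetworkX graph from a linestring GeoDataFrame using momepy.

    Assumption: lines_gdf uses a projected CRS, so edge lengths are in metres.
    """
    import momepy  # imported lazily so the helper below works without momepy

    # 'primal' treats line endpoints as nodes and the lines themselves as edges.
    return momepy.gdf_to_nx(lines_gdf, approach="primal")


def max_relative_difference(reference_lengths, rebuilt_lengths):
    """Largest relative difference between paired edge lengths (0.1 means 10%)."""
    return max(
        abs(rebuilt - reference) / reference
        for reference, rebuilt in zip(reference_lengths, rebuilt_lengths)
    )
```

A check like max_relative_difference could be run on matched edges before writing a rebuilt network to PostGIS, to confirm that lengths stay within an agreed tolerance.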

Accompanying it, perhaps we should think of more sophisticated ways of tracking the libraries used for citation purposes...