cmu-lib / bridgesofPittsburgh

Code and documents associated with the Bridges of Pittsburgh DH project at CMU

Learn difference between python osmnx package outputs and OSM shapefile data #27

Closed mdlincoln closed 5 years ago

mdlincoln commented 6 years ago

How does osmnx use/misuse OSM ids?

https://geoffboeing.com/2016/11/osmnx-python-street-networks/

mdlincoln commented 5 years ago

Documented in extensive Slack discussion, copied here:

Matthew Lincoln [9:49 AM]

alwaysreadthedocs so, the underlying OSM data.... does group physical bridges and give them identity :stuck_out_tongue: it does so essentially as a third layer of metadata called "Relations" that can contain both "Points" as well as "Ways" (multi-point edges btw - natively, these are not bezier curves or anything fancy like that, but just point series).

(further below the fold)

scottbot [9:50 AM]

waitwaitwaitwait waitwait wait w

Matthew Lincoln [9:50 AM]

Had anyone worked with this metadata before?

scottbot [9:50 AM]

nope!

Matthew Lincoln [9:50 AM]

https://www.openstreetmap.org/relation/4246068#map=16/40.4496/-79.9940&layers=D

OpenStreetMap Relation: Veterans Bridge (4246068)

https://wiki.openstreetmap.org/wiki/Relation

hahaha, well, I'm glad I finally decided to give myself a few days to actually dig into the full data OSM gives :stuck_out_tongue:

scottbot [9:51 AM]

This is fascinating

Matthew Lincoln [9:51 AM]

I'm poking around to see if this is actually going to solve our bridge identity problem

scottbot [9:51 AM]

So what about tiny bridges with no official name, do they also get this grouping?

Matthew Lincoln [9:51 AM]

that's the next thing for me to check

scottbot [9:51 AM]

I suspect it would...
Matthew Lincoln [9:52 AM]

there are some bridges that comprise a single `Way`, without any parent `Relation`

it is possible the python package yuyao was using masked this info, at least when transforming it into a classical network graph

scottbot [9:54 AM]

That sounds plausible

https://wiki.openstreetmap.org/wiki/Tag%3Aman_made%3Dbridge

Matthew Lincoln [9:55 AM]

yeah, part of this search is figuring out what tag taxonomy to use

Most of the example bridges I've found are either `Relations` with `type == bridge`, or `Ways` with `bridge == yes`

scottbot [9:56 AM]

Gotta go for a bit, meeting time

But thanks, this is both frustrating and incredible

Matthew Lincoln [9:56 AM]

yep, i'll post some more notes up as I find em

but first, I need to make fun of your whiteboard tweet

Emma [9:56 AM]

Sounds like an amazing option

Matthew Lincoln [9:57 AM]

the mere fact that there's hundreds of ways with `bridge == yes` but one with `bridge == viaduct` both warms and chills my data engineering heart

Emma [9:57 AM]

Hahhaha

scottbot [9:58 AM]

we have three options: TRUE, FALSE, and viaduct

Matthew Lincoln [9:58 AM]

making the tshirts now

@Emma this is a new slogan for you to deploy at data day

Matthew Lincoln [3:04 PM]

ok, had to do some finagling so that it didn’t get hooked on some of the `Ways` that define non-road polygons such as bridge piers… but I think I can now move on to working on the rewiring problem now that we know which edge sets need to be collapsed

[attachment: fort_wayne.png]

Matthew Lincoln [3:12 PM]

Going by the union set of `Relations` labeled as bridges as well as the `Ways` not in a larger bridge `Relation`, but which are themselves labeled as `bridge`, I get 1036 “bridges” (fwiw this includes a pretty generous bounding box, so I may be able to redo this depending on the precise export)

scottbot [3:25 PM]

interesting - do you have any idea why this is greater than the 500-some lanes listed as `bridges`?
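The union count described at 3:12 PM can be sketched roughly as below. This is only an illustration, not the code used in the project: it assumes the OSM data has been loaded as a list of Overpass-style JSON elements (dicts with `type`, `id`, `tags`, and, for relations, `members`), the function name `bridge_entities` is made up, and the member Way ids and the viaduct Way id are hypothetical. It also assumes any `bridge` tag value other than `no` (so `yes` and `viaduct`) counts as a bridge.

```python
def bridge_entities(elements):
    """Return (bridge relation ids, ids of bridge Ways not grouped by any bridge Relation)."""
    # Relations tagged type=bridge give a physical bridge its identity.
    relations = [
        e for e in elements
        if e["type"] == "relation" and e.get("tags", {}).get("type") == "bridge"
    ]
    # Ways already claimed by one of those Relations.
    grouped_ways = {
        m["ref"]
        for r in relations
        for m in r.get("members", [])
        if m["type"] == "way"
    }
    # Ways tagged as bridges themselves (assumption: anything except bridge=no)
    # that no bridge Relation claims.
    standalone_ways = {
        e["id"]
        for e in elements
        if e["type"] == "way"
        and e.get("tags", {}).get("bridge", "no") != "no"
        and e["id"] not in grouped_ways
    }
    return {r["id"] for r in relations}, standalone_ways


# Toy data: the Relation and Ridge Place ids come from the links in this thread;
# Ways 1, 2, 3 and 4 are hypothetical.
elements = [
    {"type": "relation", "id": 4246068, "tags": {"type": "bridge"},
     "members": [{"type": "way", "ref": 1, "role": ""},
                 {"type": "way", "ref": 2, "role": ""}]},
    {"type": "way", "id": 1, "tags": {"bridge": "yes"}},
    {"type": "way", "id": 2, "tags": {"bridge": "yes"}},
    {"type": "way", "id": 55180497, "tags": {"bridge": "yes"}},
    {"type": "way", "id": 3, "tags": {"bridge": "viaduct"}},
    {"type": "way", "id": 4, "tags": {"highway": "residential"}},
]

relation_ids, standalone_ids = bridge_entities(elements)
bridge_count = len(relation_ids) + len(standalone_ids)
```

Counting this way treats every ungrouped bridge `Way` as its own bridge, so sequential `Ways` that form one physical span would still be counted separately.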
Matthew Lincoln [3:28 PM]

ah, well, one caveat to that number I’m just realizing

some of the `Way`-only bridges - those without `Relation` parents - may still describe what we’d think of as one single bridge because they comprise a few sequential `Ways` as opposed to the parallel `Ways` problem we were originally dealing with

but let me poke around further…

scottbot [3:32 PM]

Mkay, thanks for all this

Matthew Lincoln [3:38 PM]

also, do you know where this original ~500 “lanes” number came from? Because if I just search for `Ways` with the tag bridge, I get 1184 (again, give or take based on the bounding box)

scottbot [3:44 PM]

Well, I think it's important to restrict the network to only those streets with a terminus within the city, but my number comes from the graphml file I was fed - the number of edges labeled bridge.

Matthew Lincoln [4:02 PM]

well I’m not sure about those edges that Yuyao found, but there are still a few multi-Way bridges that aren’t united by a larger Relation, e.g. https://www.openstreetmap.org/way/55180497

OpenStreetMap Way: Ridge Place (55180497)

Matthew Lincoln [4:12 PM]

so we might still need to use that pairwise distance heuristic to merge a few of these

but the bridges with `Relations` then become a very useful test set…

scottbot [8:49 AM]

Why so?

Matthew Lincoln [9:26 AM]

Why do the bridges with `Relations` become a useful test set? Because if we do need to build some kind of classifier that says "these `Ways` are the same bridge but these aren't", it'll be very helpful to have a set of _known_ bridge Ways that belong together to evaluate different distance thresholds

I'm still doing some filtering, though, that may make that work irrelevant.
For example, a bunch of `Ways` labeled bridges are pedestrian skyways between private buildings like UPMC hospitals, and so they'd be unreachable from the rest of the graph anywho

So the OSM tag system may still encode enough info to help us rule out a lot of those excess bridges

scottbot [9:40 AM]

No, I mean, given what you found, I'm not sure why you still need the pairwise fix

Matthew Lincoln [9:46 AM]

Because a few of the Ways that are marked as bridges but NOT grouped by Relations still look like physical bridges that we'd prefer _were_ joined as single bridges

see the Ridge Place example I posted above

Having browsed through a bit, it honestly may be faster to try running RPP or CPP and finding those problem bridges by eyeballing the results.

or rather, I think it's worth forging ahead with rewiring the graph based on the existing Relations data, and seeing where we get

scottbot [9:50 AM]

That seems reasonable
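If the pairwise distance heuristic does turn out to be needed, the idea of using `Relation`-grouped Ways as a test set might look something like the sketch below. The thread never specifies the heuristic, so everything here is an assumption: Ways are reduced to centroids in projected metres, Ways closer than a threshold are merged by single linkage, and a threshold is scored by pairwise precision and recall against the known grouping. The names `merge_by_distance` and `pairwise_scores` and all coordinates are hypothetical.

```python
from itertools import combinations
from math import dist


def merge_by_distance(centroids, threshold):
    """Single-linkage grouping: Ways whose centroids lie within `threshold` share a label."""
    parent = {w: w for w in centroids}

    def find(w):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for a, b in combinations(centroids, 2):
        if dist(centroids[a], centroids[b]) <= threshold:
            parent[find(a)] = find(b)
    return {w: find(w) for w in centroids}


def pairwise_scores(predicted, known):
    """Precision and recall over pairs of Ways, relative to the known grouping."""
    tp = fp = fn = 0
    for a, b in combinations(known, 2):
        same_pred = predicted[a] == predicted[b]
        same_true = known[a] == known[b]
        tp += same_pred and same_true
        fp += same_pred and not same_true
        fn += same_true and not same_pred
    # With no predicted (or no true) pairs, report 1.0 by convention.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


# Hypothetical centroids (metres) for four Ways on two Relation-grouped bridges.
centroids = {"w1": (0.0, 0.0), "w2": (10.0, 0.0), "w3": (500.0, 0.0), "w4": (512.0, 0.0)}
known = {"w1": "A", "w2": "A", "w3": "B", "w4": "B"}

# Score a few candidate thresholds against the known grouping.
scores = {t: pairwise_scores(merge_by_distance(centroids, t), known) for t in (5, 15, 1000)}
```

A threshold that is too small leaves parallel Ways unmerged (low recall), and one that is too large fuses neighbouring bridges (low precision); the `Relation`-grouped Ways give a way to see where that trade-off sits before applying the threshold to ungrouped Ways such as Ridge Place.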