Closed danrademacher closed 2 years ago
Tim did some initial investigation of this and pulled OSM and Microsoft buildings.
Then the client sent a list of the buildings missing outlines. There are 523 of them.
All that is in this folder: https://drive.google.com/drive/folders/175VvCu3i6Vrp7ySnjGttTIEOauCq-bDp
cc @tsinn
This is an interesting comment from the client:
Most of these buildings seem to have a shape in the OSM data somewhere they just weren’t matched using the lat/long point to polygon match originally. Here’s an example in Carto
So this is a case where the actual data they already have in CARTO contains building outlines that never got tied together.
Maybe we should start with the points and polygons they already have and just see how many Stamen just missed.
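A minimal sketch of that first pass, assuming we export the building points and the existing outline rings from CARTO as plain coordinate lists. The ids, coordinates, and function names here are hypothetical, and the ray-casting test ignores holes and multipolygons, so treat it as a rough check rather than what Stamen actually ran:

```python
from typing import Dict, Optional, Sequence, Tuple

Point = Tuple[float, float]   # (longitude, latitude)
Ring = Sequence[Point]        # exterior ring only; holes are ignored in this sketch


def point_in_ring(pt: Point, ring: Ring) -> bool:
    """Ray-casting test: True if pt falls inside the ring."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def match_points(points: Dict[int, Point],
                 outlines: Dict[str, Ring]) -> Dict[int, Optional[str]]:
    """Map each building id to the first outline containing its point, else None."""
    return {
        bid: next((oid for oid, ring in outlines.items()
                   if point_in_ring(pt, ring)), None)
        for bid, pt in points.items()
    }


# Made-up data: two square outlines and three building points.
outlines = {
    "way/1": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "way/2": [(10, 10), (12, 10), (12, 12), (10, 12)],
}
points = {101: (2, 2), 102: (11, 11), 103: (6, 6)}
matches = match_points(points, outlines)
missed = [bid for bid, oid in matches.items() if oid is None]
```

Any building left in `missed` after this pass has a point that sits outside every outline we already hold, which is the set worth checking against newer OSM or Microsoft data.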
The queries and comments in this file are important to review: https://github.com/GreenInfo-Network/seattle-building-dashboard/blob/master/docs/Building_outlines.md
Especially at the end:
Buildings that still don't have outlines fall into three types:
- Building point is in the wrong place and is too far from an outline to be found. For example, id=388 (Rainier Tower) seems to clearly be in the wrong place.
- Building point is in or near an outline already in use. For example, id=352 (Abraham Lincoln Building) overtaken by id=295 but they seem to be in the same building.
- Building point is accurate but building outline doesn't exist.
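One way to triage against those three types is sketched below. It assumes we have already computed, for each outline-less point, the nearest outline and the distance to it. The 50 m cutoff and all names are hypothetical. Note that the first and third types look identical to an automated check (nothing nearby), so they land in the same bucket for manual review:

```python
from typing import Optional, Set

MAX_MATCH_DISTANCE_M = 50.0  # hypothetical cutoff, not taken from the Stamen doc


def triage(nearest_outline_id: Optional[str],
           distance_m: Optional[float],
           outlines_in_use: Set[str]) -> str:
    """Sort an outline-less building point into a review bucket."""
    if (nearest_outline_id is None or distance_m is None
            or distance_m > MAX_MATCH_DISTANCE_M):
        # Either the point is misplaced or the outline does not exist;
        # only a manual look can tell these two cases apart.
        return "no_outline_nearby"
    if nearest_outline_id in outlines_in_use:
        # Another building id has already claimed this outline.
        return "outline_already_in_use"
    return "candidate_match"


in_use = {"way/295"}  # made-up outline id claimed by another building
buckets = [
    triage("way/295", 3.0, in_use),
    triage("way/900", 12.5, in_use),
    triage("way/901", 400.0, in_use),
    triage(None, None, in_use),
]
```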
It looks like there are contradictions between the outlines data in CARTO, current OSM, and Microsoft.
As shown here where purple is CARTO current and brown is OSM:
OSM appears more accurate. We assume purple came from 2009 Seattle data. This points to more complex data QA than we expected. In that example, one of the two items has an OSM ID, and it was deleted 4 years ago: https://www.openstreetmap.org/way/137152318
Breaking this down:
Other considerations:
One question that came up for Terry is why we have 281K building outlines, only 3600 of which have a buildingid, but then we test for that ID being null as a sign of no ID.
Looking closer at the query Terry suggested,
SELECT osebuildingid
,buildingname
,firstyearrequired
,taxparcelidentificationnumber
,fulladdress
,latitude
,longitude
,seattle_building_outlines_20181126.osm_id
,seattle_building_outlines_20181126.buildingid
FROM PUBLIC.table_2020_required_buildings
LEFT JOIN PUBLIC.seattle_building_outlines_20181126 ON table_2020_required_buildings.osebuildingid = PUBLIC.seattle_building_outlines_20181126.buildingid
WHERE osm_id IS NULL
ORDER BY osebuildingid
This query depends on finding matches on buildingid, and also on osm_id being null as a sign that we have no polygon connection. In reviewing the data this morning, we noted that buildingid is populated for only 1.3% of the building outline data. This was unexpected. Also, osm_id is missing for about 75K buildings. Since some buildings came from 2009 Seattle data, this might be expected, but it still leaves us unsure about using these IDs to figure out missing polygons.
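To make the concern concrete, here is a small sketch using an in-memory SQLite database with simplified, hypothetical table names and made-up rows (not the real CARTO tables). It shows that `WHERE osm_id IS NULL` after the LEFT JOIN returns two different situations: a building with no joined outline row at all, and a building whose outline row exists but has no osm_id:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE required_buildings (osebuildingid INTEGER, buildingname TEXT);
CREATE TABLE outlines (osm_id INTEGER, buildingid INTEGER);

INSERT INTO required_buildings VALUES
    (1, 'outline with osm_id'),
    (2, 'outline without osm_id'),
    (3, 'no outline row');

-- The last outline has no buildingid, like most rows in the real table.
INSERT INTO outlines VALUES (1001, 1), (NULL, 2), (1002, NULL);
""")

rows = conn.execute("""
SELECT b.osebuildingid,
       o.buildingid IS NULL AS no_outline_row,
       o.osm_id IS NULL     AS no_osm_id
FROM required_buildings AS b
LEFT JOIN outlines AS o ON b.osebuildingid = o.buildingid
WHERE o.osm_id IS NULL
ORDER BY b.osebuildingid
""").fetchall()

# Building 2 has a polygon but is still reported; building 3 truly has none.
reported_ids = [r[0] for r in rows]
truly_missing = [r[0] for r in rows if r[1] == 1]
```

If the real data behaves like this, testing `buildingid IS NULL` on the joined side would separate "no polygon tied to this building" from "polygon tied, but not from OSM".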
In any case, we're going to proceed with our Steps 1, 2, and 3 above. If we are able to get matches from the latest OSM and Microsoft data, then eventually we might just flush unconnected polygons out of CARTO so they aren't sitting around causing confusion in the future.
The total number of unmatched/missing buildings is 523. This first analysis was done against the OSM data and the city parcels downloaded from here
| | missing buildings | type of match | review |
| -- | -- | -- | -- |
| 1 | 457 | matched with buildings | Since there is a parcel id attached to the missing buildings data, I'll crosscheck whether the parcels the geocoded points fall in match those ids, to be sure the points are in the right place. |
| 2 | 52 | landed within a parcel but didn't match a specific building | Most of the reviewed parcels do have buildings, but some have more than one and in others the points landed on open space, missing the building. It could also be a case of a missing building outline in OSM. These will need some manual moving of points. I'll use Google Street View to verify the address. |
| 3 | 14 | didn't match a building or a parcel | The addresses did not geocode correctly: 1 landed in Nebraska (1959 NE PACIFIC ST, ,) and 2 outside the city borders. There are 3 duplicates (same address, could be a multi-floor building). 1 has just the street name, which makes it hard to pair with a building, but I'll use the parcel id to find where that building could be. The remaining 7 land on the streets and will need a hand review by address to pair with a building. |
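A rough sketch of how the three tiers and the parcel crosscheck could be expressed, assuming each geocoded point has already been tagged with the building outline and parcel it landed in (either may be absent). The function names and the parcel id cleanup rule are assumptions. The tier counts are the ones from the table:

```python
from typing import Optional


def match_tier(building_id: Optional[str], parcel_id: Optional[str]) -> str:
    """Assign a geocoded point to one of the three tiers in the table."""
    if building_id is not None:
        return "matched_building"
    if parcel_id is not None:
        return "parcel_only"
    return "no_match"


def normalize_pin(pin: str) -> str:
    """Assumed cleanup: drop spaces and hyphens before comparing parcel ids."""
    return pin.replace("-", "").replace(" ", "")


def parcel_agrees(expected_pin: str, landed_pin: Optional[str]) -> bool:
    """True if the parcel the point landed in matches the id on the record."""
    if landed_pin is None:
        return False
    return normalize_pin(expected_pin) == normalize_pin(landed_pin)


# Counts reported in the table; they should account for every missing building.
tier_counts = {"matched_building": 457, "parcel_only": 52, "no_match": 14}
total = sum(tier_counts.values())
```

A tier 1 match where `parcel_agrees` returns False would be worth a second look, since the point may have landed in a neighboring building.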
We have handy instructions from Stamen on how we can pull outline-less buildings from CARTO: https://github.com/GreenInfo-Network/seattle-building-dashboard/blob/master/docs/Building_outlines.md
Though Terry is also planning to send us a list.
Then we can use this 2019-2020 vintage data from Microsoft to find outlines for the missing buildings.
If we still have missing outlines, we'll want to see how many there are and whether we manually review them or take some other approach. Some would likely be due to wrong locations of latlng points, while others could simply be missing outlines in the MS data.