Open caged opened 9 years ago
This only seems to be an issue with addresses in Portland proper, and not all of them suffer from it.
Blue - All buildings
Green - Buildings that have a state_id (tlid) match in addresses
After landing #5, specifically https://github.com/rosecitygis/osm-building-import/blob/master/sql/_normalize_state_id.sql, this is in much better shape. We now have 595,123 buildings matched with addresses. That's almost the entire dataset.
I'm going to leave this open because I'd love to get some guidance from some community OSM members and folks over at Metro on whether this should be considered resolved. I've done some spot checking and things appear to be ok.
Are you proposing this be the building_id or just want to retain it? I'd propose we use the building ID and use that to maintain links back to RLIS for future changes like building removal or additions. Using a minimum set of tags is the preference, I believe.
Are you proposing this be the building_id or just want to retain it?
There are a few different ids, so this can be a little hard to keep straight. There is an existing bldg_id (which is actually composed of the state_id) [1], so that takes care of the building id. state_id (in buildings) and tlid (in addresses) are used to match buildings and relevant addresses together. The process looks like this:
There are other issues to consider beyond this, such as the case of multiple addresses, that I think we should tackle in a different issue.
In the RLIS addresses dataset, most Portland addresses are formatted with leading zeros and a dash. However, the RLIS building dataset discards all the leading zeros and the dash.
Example
Proposed solution
Create a postgres function that normalizes state_id/tlid across buildings and addresses data. This function should be used when processing state_id/tlid for storage during create operations and should include the whitespace processing already done in #2. Specifically, all "finalized" tables should be processed in this manner.
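To make the intended normalization concrete, here is a rough sketch of the rule in Python. It is not the postgres function itself. It assumes the whitespace processing from #2 means removing all whitespace, and that an id contains at most one dash, followed by a zero-padded suffix. The function name and the sample ids are made up for illustration and are not taken from RLIS.

```python
import re


def normalize_state_id(raw):
    """Reduce a state_id/tlid to a form that is comparable across both tables.

    Assumptions (not confirmed by the datasets): whitespace carries no
    meaning, and at most one dash separates a prefix from a zero-padded suffix.
    """
    if raw is None:
        return None
    # Drop all whitespace (assumed equivalent of the cleanup from #2).
    compact = re.sub(r"\s+", "", raw)
    # Split at the first dash, if there is one.
    prefix, dash, suffix = compact.partition("-")
    if not dash:
        # Building-style value: the dash and zeros are already gone.
        return compact
    # Address-style value: discard the dash and the suffix's leading zeros.
    return prefix + suffix.lstrip("0")


# Hypothetical ids: the address-style and building-style spellings converge.
address_style = normalize_state_id("ABC12  -00450")
building_style = normalize_state_id("ABC12 450")
```

The point of the sketch is that both tables pass through the same function, so a join on the normalized value does not depend on which dataset the id came from.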