Closed danrademacher closed 1 year ago
Hi Dan,
I was able to review the first 12 entries in this spreadsheet and I made some notes regarding what I am seeing. In short, so far I can find the necessary building footprints in the 2015 building outline data from Seattle. What is occasionally odd is that a few of the entries have unclear directions around the action that needs to be taken. I've put notes here, in the last tab. https://docs.google.com/spreadsheets/d/1DAyLq1XWCSxC5smWZMkNtcv0byxVxTPG/edit#gid=134238005.
Types of issues I am seeing so far:
I pulled the data down from CARTO and saved here:
P:\proj_p_s\Seattle Building Dashboard\2023 update
It's a single SHP with only 2 columns:
For reimport into CARTO, we would drop cartodb_id, which it assigns on import, and all we need is the footprint and buildingid, which matches the ID in the tabular energy data.
One thing we need to confirm with @tomay is how he wants multi-building records to be stored:
buildingid values across multiple rows
buildingid per record with a multipolygon of all buildings under that record, like for campuses.
I think he wants the second version, but if the first is easier to make, that might be fine too.
Also, if we wanted to include an additional column, like seattle_opendata_id for the reference to the city's building outlines file, that might be helpful for future updates.
Noting here we have a link to an 0828 version of this on Sharepoint.
Probably that should replace the one on Drive.
Some building outlines need to be digitized - what building ID do we assign to those?
Next steps:
Schema notes:
buildingid is the ID from the energy reporting dashboard, not the open data ID
opendata_id field with that value
source -- "Seattle Open Data" where appropriate or "Digitized"
Here is the update on my progress:
As of right now, this spreadsheet: Master_DataUpdates carries a master summary list of all the updates deemed appropriate based on my review of the three 2021 Open Data Table data tracking spreadsheets (_20230801, _20230815, _20230828), that are also found on the Seattle Buildings Sharepoint.
In the above Master_DataUpdates sheet, we have two tabs - one for updates to Attributes and one for updates to Geometry. Aside from 5 currently unresolved cases, all geometry updates have been made. Tom said he can wait until we resolve those 5 cases, before we make the updates to attributes as well.
The updated shapefile is stored here: P:\proj_p_s\Seattle Building Dashboard\2023 update; the shapefile name is seattle_building_outlines_2023_updated.shp. It contains 4 attribute fields: cartodb_id, buildingid, opendataid, and source. It is NOT a multipolygon shapefile, meaning that footprints under the same buildingid are all separate polygons. I can change the script to dissolve based on buildingid and create multipolygons, if that is preferable.
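For reference, the multipolygon option could look roughly like the sketch below. It assumes the shapefile has been exported to GeoJSON-style features with Polygon geometries and a buildingid property; the function name and sample data are hypothetical, and this is not the script mentioned above. It only collects the separate polygons under one MultiPolygon per buildingid. It does not merge touching or overlapping shapes the way a true GIS dissolve would.

```python
from collections import defaultdict


def collect_footprints_by_buildingid(features):
    """Return one MultiPolygon feature per buildingid.

    Assumes every input feature is a GeoJSON Feature with a Polygon
    geometry and a "buildingid" property (hypothetical layout).
    """
    grouped = defaultdict(list)
    for feature in features:
        building_id = feature["properties"]["buildingid"]
        # Polygon coordinates are a list of rings; a MultiPolygon is a
        # list of such polygons, so each polygon is appended whole.
        grouped[building_id].append(feature["geometry"]["coordinates"])

    return [
        {
            "type": "Feature",
            "properties": {"buildingid": building_id},
            "geometry": {"type": "MultiPolygon", "coordinates": polygons},
        }
        for building_id, polygons in grouped.items()
    ]
```

The output keeps only buildingid, so opendataid and source would need their own rule (for example, a list of values per record) if we want to carry them through.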
I have not dropped cartodb_id, but can do so. Also, opendataid and source are only populated for the records that we updated, not for all records; otherwise their value is Null. Not sure if we want to do it differently, or perhaps add a column of "last updated" with a year value or "unknown" for those we currently don't know.
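If we go with the "last updated" idea, a rough sketch of the rule is below. It assumes the attribute table is loaded as a list of dicts with None for Null, treats any record with opendataid or source populated as touched in this round, and uses 2023 as the year only because of the "2023 update" folder name. The function name and the last_updated column name are hypothetical.

```python
def add_last_updated(records, update_year=2023):
    """Return copies of the records with a "last_updated" value added.

    Records with opendataid or source populated are assumed to have been
    edited in this round; all others get "unknown".
    """
    updated = []
    for record in records:
        row = dict(record)  # copy so the input rows are left untouched
        touched = (
            row.get("opendataid") is not None
            or row.get("source") is not None
        )
        row["last_updated"] = update_year if touched else "unknown"
        updated.append(row)
    return updated
```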
Mike uploaded a new table for review: 2021_Open_Data_Table_20230922. I need to review those entries and update the shapefile above.
Mike uploaded the last new version of this table, with 36 new building footprint edits. That's on Sharepoint. I also saved a copy to Drive here.
Any new ones include the string "- MR" in the "Data Issue" column.
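To pull just the new rows, something like this sketch could work. It assumes the table has been exported to CSV with a header row; the "Data Issue" column name comes from the sheet, but the function name and the sample columns in the usage note are hypothetical.

```python
import csv
import io


def rows_flagged_mr(csv_text, column="Data Issue", marker="- MR"):
    """Return the rows whose "Data Issue" cell contains the "- MR" marker."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Missing or empty cells are treated as an empty string, so they never match.
    return [row for row in reader if marker in (row.get(column) or "")]
```

Usage would be along the lines of `rows_flagged_mr(open("table.csv", newline="").read())`, where table.csv is a placeholder for the exported file.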
Only four of these new changes have Seattle Open Data IDs. It looks to me like footprint digitizing work that @joseph-stout could pick up as he did last time.
Happy to start on this tomorrow afternoon. Will I need to generate new Seattle Open Data IDs somehow, where they are absent?
No, my assumption is that those buildings just need to be digitized from imagery because they are not in Seattle Open Data.
We got a copy of the building spreadsheet: https://docs.google.com/spreadsheets/d/1DAyLq1XWCSxC5smWZMkNtcv0byxVxTPG/edit?usp=drive_link&ouid=101583646446838428713&rtpof=true&sd=true
Here's the 2015 building outline data from Seattle: https://data.seattle.gov/dataset/Building-Outline-2015/4erg-k47y
In theory, we should be able to use Columns A (OSE ID) and D (Footprint) in the XLSX to get the vast majority of building outlines.
I am not sure about the next step from there -- replacing bad outlines in CARTO with new ones, but it should be tractable with some combination of Postgres queries, or even pulling down the whole building footprint dataset, fixing it, and replacing what's on CARTO.
For now, let's focus on making sure the IDs in A and D plus the open data footprints really will get us what we need.
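A minimal sketch of that ID check, assuming the XLSX rows have already been read into (OSE ID, Footprint) pairs and the open data footprint IDs into a plain list. The function name and the sample values in the test are hypothetical, and the comparison is a simple string match after trimming whitespace.

```python
def check_footprint_coverage(sheet_rows, open_data_ids):
    """Split spreadsheet rows by whether their footprint ID is in the open data.

    sheet_rows: iterable of (ose_id, footprint_id) pairs (Columns A and D).
    open_data_ids: iterable of IDs present in the building outline data.
    Returns (matched, missing): matched is a list of (ose_id, footprint_id)
    pairs, missing is a list of OSE IDs with a blank or unknown footprint ID.
    """
    known = {str(value).strip() for value in open_data_ids}
    matched, missing = [], []
    for ose_id, footprint_id in sheet_rows:
        key = "" if footprint_id is None else str(footprint_id).strip()
        if key and key in known:
            matched.append((ose_id, key))
        else:
            missing.append(ose_id)
    return matched, missing
```

Anything that lands in missing is a candidate for a closer look or for digitizing; an OSE ID that appears on several rows stays as several pairs in matched.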