GreenInfo-Network / seattle-building-dashboard

Energy benchmarking for Seattle
https://greeninfo-network.github.io/seattle-building-dashboard/
ISC License

Buildings with unique IDs sometimes have identical locations #101

Closed (tomay closed this 3 weeks ago)

tomay commented 3 weeks ago

The “the_geom” field is showing duplicate values across multiple OSEIDs. In other words, I’m seeing different properties within the same calendar year that have the same “the_geom” value.

In the data, it does seem like there are about 150 buildings that fit this query:

SELECT *
FROM (
  SELECT the_geom, COUNT(DISTINCT id) AS id_count
  FROM public.seattle_buildings_2022_update_mr_20240425_v21
  GROUP BY the_geom
  ORDER BY COUNT(DISTINCT id) DESC
) AS id_check
WHERE id_count > 1

That is, rows where a single the_geom value has multiple building IDs associated with it.
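
As a rough illustration of what the query checks, here is a minimal Python sketch that groups building IDs by geometry and keeps only the geometries shared by more than one distinct ID. Unlike the query, it also lists which IDs collide. The sample rows and the function name `ids_sharing_geom` are made up for illustration; none of the values come from the Seattle table.

```python
from collections import defaultdict


def ids_sharing_geom(rows):
    """Return {the_geom: sorted ids} for geometries used by more than one distinct id.

    `rows` is an iterable of (id, the_geom) pairs; the_geom is treated as an
    opaque string here (hypothetical WKT text), compared by exact equality.
    """
    ids_by_geom = defaultdict(set)
    for building_id, the_geom in rows:
        # A set mirrors COUNT(DISTINCT id): repeated ids count once.
        ids_by_geom[the_geom].add(building_id)
    return {
        geom: sorted(ids)
        for geom, ids in ids_by_geom.items()
        if len(ids) > 1
    }


# Hypothetical (id, the_geom) rows, not real Seattle data.
sample_rows = [
    (101, "POINT(-122.3321 47.6062)"),
    (102, "POINT(-122.3321 47.6062)"),
    (103, "POINT(-122.3493 47.6205)"),
    (101, "POINT(-122.3321 47.6062)"),  # same id repeated, e.g. another year
]

shared = ids_sharing_geom(sample_rows)
print(shared)
```

Listing the colliding IDs side by side makes it easier to spot-check whether they also share an address.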


tomay commented 3 weeks ago

Reviewing some of the buildings in that query, here's what I found:

tomay commented 3 weeks ago

If the_geom is generated from lat/lon, then it would make sense that there would be duplicate the_geom values. Properties with matching addresses and matching lat/lon also make sense, because the lat/lon values are pulled from the city’s Geodatabase based on addresses.
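
To make that reasoning concrete, here is a minimal sketch, assuming the point geometry is built directly from an address-based lat/lon lookup. The `ADDRESS_TO_LATLON` table, the address, and the building IDs are all hypothetical stand-ins, not the city's Geodatabase or real benchmarking records.

```python
# Hypothetical address -> (lat, lon) lookup standing in for a geocoding source.
ADDRESS_TO_LATLON = {
    "100 EXAMPLE ST": (47.6062, -122.3321),
}


def point_wkt(address):
    """Build a WKT point (lon lat order) from the looked-up coordinates."""
    lat, lon = ADDRESS_TO_LATLON[address]
    return f"POINT({lon} {lat})"


# Two distinct (hypothetical) building IDs reported at the same address.
buildings = {9001: "100 EXAMPLE ST", 9002: "100 EXAMPLE ST"}

geoms = {building_id: point_wkt(address) for building_id, address in buildings.items()}

# Same address -> same lat/lon -> identical geometry for both IDs.
print(geoms[9001] == geoms[9002])
```

Under this assumption, any two IDs assigned the same address will always end up with the same geometry, regardless of how many physical buildings sit at that address.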

I’m not super-concerned about the_geom ids overlapping. I think this is bound to happen because address assignment for benchmarking is a somewhat imprecise science.