chihacknight / decarbonize-my-state

What does it take to decarbonize your state?
MIT License

Get data on number of buildings in each State #21

Closed derekeder closed 2 years ago

derekeder commented 2 years ago

For the Building electrification part of the state detail page, we will need to get counts of the number of buildings in a given state.

Research where to find this data - bonus points for breaking the building counts up by residential and non-residential.

Data should be collected in a Google Sheet with one row for each State

derekeder commented 2 years ago

Possible data sources

nofurtherinformation commented 2 years ago

I took a look at OSM data and chatted with a friend about data sources here. The OSM data wasn't too bad to parse, but the numbers seemed quite low overall. Fortunately, Microsoft released a building footprints dataset that seems more reasonable and is still open source.

For sanity, here's a screenie of Delaware, one of the places where the two diverge the most: image

nofurtherinformation commented 2 years ago

If that data seems reasonable to folks (I think OK on my end), I've got it output in an outer tagged JSON format like the other data so far, and also an OSM script for posterity.

derekeder commented 2 years ago

@nofurtherinformation nice work! The Microsoft data seems like a good find! From the Readme, it looks like they'd like to import it to OSM, but it's a bit of a process.

A few follow-up questions:

nofurtherinformation commented 2 years ago

Thanks @derekeder! Hopefully it gets integrated into OSM in the future, but I agree that for now the Microsoft data seems to be the way to go.

For sanity checks, the correlation with OSM is already pretty good (r of about 0.75, r² of 0.56), and comparing against population is also straightforward. There are a bunch of metro counties with Assessor data we can compare against (Cook, LA, NY, etc.), and Vermont publishes its State E911 footprint address data, so we can get some regional/rural confirmation too. I suspect Microsoft might overcount slightly for things like sheds or accessory units. I'll do a quick sanity check on the counts and circle back here.
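As a rough illustration of the correlation check, here is a minimal sketch of computing Pearson's r and r² between two sets of per-state building counts. The counts below are made-up placeholders, not the actual OSM or Microsoft numbers, and `pearson_r` is a hypothetical helper name. Note that an r of 0.75 squares to 0.5625, which is consistent with the r² of .56 quoted above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder per-state building counts (NOT real data), one entry per state.
osm_counts = [120_000, 450_000, 80_000, 900_000, 300_000]
ms_counts = [350_000, 700_000, 400_000, 1_500_000, 650_000]

r = pearson_r(osm_counts, ms_counts)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")
```

The same function could be pointed at population figures instead of OSM counts for the second comparison mentioned above.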

For building type, this is slightly trickier. The Microsoft data is just footprint geometries, but we could combine those with zoning data where available to make a decent guess! Places like Texas will be a challenge, since zoning doesn't strictly exist. The same issues exist for OSM coverage on land use/zoning, but we'd get enough buildings that I think it'd be reasonably representative :)

derekeder commented 2 years ago

@nofurtherinformation nice work! Let's roll with the Microsoft data then.

Combining with zoning is a good idea, but as you say, it would vary widely from state to state, and as far as I know the data doesn't exist in one combined place like the US Building Footprints. It strikes me that knowing the number of houses, apartments and other buildings in America is something that a lot of people would want/need to know. I wonder which existing research groups have already done estimates of this.

nofurtherinformation commented 2 years ago

Thanks @derekeder! I've got some time tonight to sanity check the Microsoft data, so it should be good to go for tomorrow evening's meetup.

On building types, I did a bit more digging and found this report from the National Renewable Energy Lab. This goes a bit further down the rabbit hole on modeling (e.g. estimating wall types), but critically they do provide by-building metadata files with state tags and building typologies. For commercial buildings, these look like:

'SmallOffice', 'RetailStripmall', 'RetailStandalone', 'Warehouse', 'QuickServiceRestaurant', 'Outpatient', 'MediumOffice', 'FullServiceRestaurant', 'SecondarySchool', 'LargeHotel', 'PrimarySchool', 'Hospital', 'SmallHotel', 'LargeOffice'

These are based on 2018 building stocks, so reasonably recent. For future reference, here's a link to the ResStock (residential stock) metadata TSV and to the ComStock (commercial stock) metadata TSV. Even if the total building count is different, this is probably a reasonable sample to go on and use as a percentage split on the Microsoft footprints!
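To make the percentage-split idea concrete, here is a minimal sketch. It assumes each metadata row carries `in.state_abbreviation`, `in.building_type` and `weight` columns (as in the ComStock sample row further down), and that `weight` can be read as the number of real buildings a sample row stands for; that reading is an assumption. The function names, the rows and the footprint total are all hypothetical placeholders.

```python
from collections import defaultdict


def type_shares(rows, state):
    """Weighted share of each building type among the sample rows for one state."""
    totals = defaultdict(float)
    for row in rows:
        if row["in.state_abbreviation"] == state:
            totals[row["in.building_type"]] += float(row["weight"])
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError(f"no rows for state {state!r}")
    return {btype: w / grand_total for btype, w in totals.items()}


def split_footprints(footprint_count, shares):
    """Apply the sample shares to a footprint total to estimate counts per type."""
    return {btype: footprint_count * share for btype, share in shares.items()}


# Placeholder rows shaped like the metadata TSV (values are made up).
sample_rows = [
    {"in.state_abbreviation": "AL", "in.building_type": "SmallOffice", "weight": "6.0"},
    {"in.state_abbreviation": "AL", "in.building_type": "Warehouse", "weight": "4.0"},
    {"in.state_abbreviation": "IL", "in.building_type": "Hospital", "weight": "5.0"},
]

shares = type_shares(sample_rows, "AL")
estimates = split_footprints(1_000, shares)  # 1_000 is a placeholder footprint total
for btype, count in estimates.items():
    print(f"{btype}: {count:.0f}")
```

In practice the rows could come from `csv.DictReader(f, delimiter="\t")` over the metadata file. A residential versus non-residential split would need the summed weights from both the ResStock and ComStock files rather than ComStock alone.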

nofurtherinformation commented 2 years ago

Even more interesting, the NREL data gives us by-building data on things like HVAC system types, energy consumption across a variety of metrics (heating and cooling, lighting) and seasonal estimates. Food for thought--here's a sample row of ComStock data:

bldg_id 105
applicability True
in.upgrade_name Baseline
in.tstat_clg_delta_f 5
in.tstat_clg_sp_f 77
in.tstat_htg_delta_f 8
in.tstat_htg_sp_f 63
in.aspect_ratio 2
in.county G0100890
in.building_type SmallOffice
in.rotation 270
in.number_of_stories 1
in.sqft 17500
in.hvac_system_type PSZ-AC with electric coil
in.weekday_operating_hours 8.5
in.weekday_opening_time 8
in.weekend_operating_hours 12.5
in.weekend_opening_time 11
in.energy_code_followed_during_last_exterior_lighting_replaceme ComStock DOE Ref 1980-2004
in.energy_code_followed_during_last_hvac_replacement ComStock DOE Ref 1980-2004
in.energy_code_followed_during_last_interior_equipment_replacem ComStock DOE Ref 1980-2004
in.energy_code_followed_during_last_interior_lighting_replaceme ComStock 90.1-2019
in.energy_code_followed_during_last_roof_replacement ComStock DOE Ref 1980-2004
in.energy_code_followed_during_last_service_water_heating_repla ComStock DOE Ref 1980-2004
in.energy_code_followed_during_last_walls_replacement ComStock DOE Ref 1980-2004
in.energy_code_followed_during_last_windows_replacement ComStock 90.1-2007
in.energy_code_followed_during_original_building_construction ComStock DOE Ref 1980-2004
in.heating_fuel Electricity
in.number_stories 1
in.service_water_heating_fuel Electricity
stat.air_system_fan_total_efficiency 0
stat.average_boiler_efficiency 0
stat.average_dx_cooling_cop 3.46235472864205
stat.average_dx_heating_cop 0
stat.average_gas_coil_efficiency 0
stat.design_dx_cooling_cop 3.07743026390591
stat.design_dx_heating_cop 0
stat.occupant_density_ppl_per_m_2 0.053819552083549
qoi_report.maximum_daily_timing_shoulder_hour 9.0873786407767
qoi_report.maximum_daily_timing_summer_hour 9.84496124031008
qoi_report.maximum_daily_timing_winter_hour 9.45112781954887
qoi_report.maximum_daily_use_shoulder_kw 23.16081538288
qoi_report.maximum_daily_use_summer_kw 29.7602716030508
qoi_report.maximum_daily_use_winter_kw 29.5302216031377
qoi_report.minimum_daily_use_shoulder_kw 11.7128145227106
qoi_report.minimum_daily_use_summer_kw 11.6088448793368
qoi_report.minimum_daily_use_winter_kw 12.7324554597747
in.nhgis_tract_gisjoin G0100890003000
in.nhgis_county_gisjoin G0100890
in.state_name Alabama
in.state_abbreviation AL
in.census_division_name East South Central
in.census_region_name South
in.weather_file_2018 USA_AL_Huntsville.Madison.723230_2018.epw
in.weather_file_TMY3 Huntsville_Intl_Jones_Field
in.climate_zone_building_america Mixed-Humid
in.climate_zone_ashrae_2004 3A
in.iso_region None
in.reeds_balancing_area 89
in.resstock_county_id AL, Madison County
in.nhgis_puma_gisjoin G01000302
out.district_cooling.cooling.energy_consumption 0
out.district_cooling.cooling.energy_consumption_intensity 0
out.district_heating.heating.energy_consumption 0
out.district_heating.heating.energy_consumption_intensity 0
out.district_heating.water_systems.energy_consumption 0
out.district_heating.water_systems.energy_consumption_intensity 0
out.electricity.cooling.energy_consumption 7383.33333333333
out.electricity.cooling.energy_consumption_intensity 0.421904761904762
out.electricity.exterior_lighting.energy_consumption 18058.3333333333
out.electricity.exterior_lighting.energy_consumption_intensity 1.03190476190476 10025 0.572857142857143
out.electricity.heat_recovery.energy_consumption 0
out.electricity.heat_recovery.energy_consumption_intensity 0
out.electricity.heat_rejection.energy_consumption 0
out.electricity.heat_rejection.energy_consumption_intensity 0
out.electricity.heating.energy_consumption 7308.33333333333
out.electricity.heating.energy_consumption_intensity 0.417619047619048
out.electricity.interior_equipment.energy_consumption 94127.7777777778
out.electricity.interior_equipment.energy_consumption_intensity 5.37873015873016
out.electricity.interior_lighting.energy_consumption 14847.2222222222
out.electricity.interior_lighting.energy_consumption_intensity 0.848412698412698
out.electricity.pumps.energy_consumption 2.77777777777778
out.electricity.pumps.energy_consumption_intensity 0.00015873015873
out.electricity.refrigeration.energy_consumption 0
out.electricity.refrigeration.energy_consumption_intensity 0
out.electricity.water_systems.energy_consumption 5552.77777777778
out.electricity.water_systems.energy_consumption_intensity 0.317301587301587
out.natural_gas.heating.energy_consumption 0
out.natural_gas.heating.energy_consumption_intensity 0
out.natural_gas.interior_equipment.energy_consumption 0
out.natural_gas.interior_equipment.energy_consumption_intensity 0
out.natural_gas.water_systems.energy_consumption 0
out.natural_gas.water_systems.energy_consumption_intensity 0
out.other_fuel.heating.energy_consumption 0
out.other_fuel.heating.energy_consumption_intensity 0
out.other_fuel.water_systems.energy_consumption 0
out.other_fuel.water_systems.energy_consumption_intensity 0 0 0 0 0 157305.555555556 8.98888888888889 157305.555463116 8.9888888836066 0 0 0 0
upgrade 0
weight 7.04174071967379
metadata_index 0

derekeder commented 2 years ago

@nofurtherinformation whoa that is some crazy detail! I'd be curious to see what the coverage is for the in.heating_fuel and in.service_water_heating_fuel attributes and what the distribution of values is. If we have good coverage, we could get very precise on how many buildings still need to be electrified!
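A minimal sketch of that coverage and distribution check, assuming rows with the `in.heating_fuel` and `weight` columns from the sample above. Treating an empty string, "None" and "NA" as missing is an assumption, as are the helper name `fuel_summary` and the placeholder fuel values other than "Electricity".

```python
from collections import Counter

# Assumed markers for a missing value; adjust to whatever the real file uses.
MISSING = {"", "None", "NA"}


def fuel_summary(rows, column):
    """Return (coverage, distribution) for one fuel column, both weighted.

    coverage: share of total weight with a non-missing value in `column`.
    distribution: share of the covered weight held by each distinct value.
    """
    total_weight = 0.0
    by_value = Counter()
    for row in rows:
        weight = float(row["weight"])
        total_weight += weight
        value = (row.get(column) or "").strip()
        if value not in MISSING:
            by_value[value] += weight
    covered = sum(by_value.values())
    coverage = covered / total_weight if total_weight else 0.0
    distribution = {v: w / covered for v, w in by_value.items()} if covered else {}
    return coverage, distribution


# Placeholder rows (made-up values, not real ComStock data).
fuel_rows = [
    {"in.heating_fuel": "Electricity", "weight": "2.0"},
    {"in.heating_fuel": "NaturalGas", "weight": "1.0"},
    {"in.heating_fuel": "", "weight": "1.0"},
]

coverage, distribution = fuel_summary(fuel_rows, "in.heating_fuel")
print(f"coverage: {coverage:.0%}")
for fuel, share in distribution.items():
    print(f"{fuel}: {share:.0%}")
```

The same call with `"in.service_water_heating_fuel"` would cover the water heating attribute. One minus the "Electricity" share gives a first estimate of the stock still to electrify.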

nofurtherinformation commented 2 years ago

Heya @derekeder, agreed, really detailed! I pulled some summary data by county (we can shift up to state, but may be useful to explore and get a sense of the data). I plopped together a quick explorer with four pages:

Check it out here -- note the color bins on the maps are not fixed between residential and commercial, but this will help to get a sense of the distribution.

For sanity checking data, here's what we've got for counts of buildings via government footprint data vs Microsoft data:

Place      Gov          Microsoft
VERMONT    419,331      351,266
CHICAGO    820,606      907,967
LA         1,122,422    1,339,971

Taking these 3 cases, it's not perfect, but the counts are within roughly 10-20% of each other. I'd be inclined to suggest this is good enough if you feel comfortable, given the uncertainty in both the Microsoft data and the open government data, but we can also pull some other county locations to confirm assumptions here.
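The relative differences can be computed directly from the table. This sketch takes the government count as the baseline and reads the LA Microsoft figure as 1,339,971 (assuming the comma placement in the table is a typo); `pct_diff` is a hypothetical helper name.

```python
# (government count, Microsoft count) per place, taken from the table above.
counts = {
    "Vermont": (419_331, 351_266),
    "Chicago": (820_606, 907_967),
    "LA": (1_122_422, 1_339_971),
}


def pct_diff(gov, microsoft):
    """Signed percent difference of the Microsoft count relative to the gov count."""
    return (microsoft - gov) / gov * 100


for place, (gov, ms) in counts.items():
    print(f"{place}: {pct_diff(gov, ms):+.1f}%")
# Vermont: -16.2%
# Chicago: +10.6%
# LA: +19.4%
```

Microsoft comes in lower than the government count for Vermont and higher for the two metro areas.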

derekeder commented 2 years ago

@nofurtherinformation thanks for this. This is really great! I agree it's close enough that we should proceed with it. Do you want to take on getting a CSV of the data rolled up by state with the relevant columns of data?

nofurtherinformation commented 2 years ago

Definitely! Just filed a PR that compiles all this. For convenience, here's a link to the Google Sheet and a direct CSV link!