danielolsen commented 3 years ago

:rocket:

[x] Is your feature request essential for your project?

Describe the workflow you want to enable

I wish there was a function that populated the Pd field for the bus table for the HIFLD grid.

Describe your proposed implementation

The new function should either live in prereise.gather.griddata.hifld.data_process.transmission or in a new module in a similar location (prereise.gather.griddata.hifld.data_process.demand?). One route could be demand proportional to population, which could be obtained from the census or similar sources.
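
A minimal sketch of the population-proportional route, to make the idea concrete. Everything here is an assumption for illustration: the function name `distribute_demand`, the plain-dict inputs, the choice to split a ZIP code's population equally among the buses located in it, and the made-up numbers. It is not the implementation that ended up in PreREISE.

```python
from collections import Counter


def distribute_demand(total_demand_mw, bus_to_zip, zip_population):
    """Return {bus_id: Pd in MW}, with demand proportional to population.

    Assumption: each ZIP code's population is shared equally among its buses.
    """
    buses_per_zip = Counter(bus_to_zip.values())
    # Population attributed to each bus; ZIP codes missing from the table count as 0.
    bus_population = {
        bus: zip_population.get(zip_code, 0) / buses_per_zip[zip_code]
        for bus, zip_code in bus_to_zip.items()
    }
    total_population = sum(bus_population.values())
    if total_population == 0:
        raise ValueError("no population could be matched to any bus")
    # Scale so that the bus demands sum to the requested total.
    return {
        bus: total_demand_mw * population / total_population
        for bus, population in bus_population.items()
    }


# Hypothetical inputs: three buses, two ZIP codes, 100 MW to distribute.
bus_to_zip = {1: "98101", 2: "98101", 3: "98052"}
zip_population = {"98101": 10_000, "98052": 30_000}
pd_by_bus = distribute_demand(100.0, bus_to_zip, zip_population)
```

With these inputs, buses 1 and 2 split the smaller ZIP code's share and bus 3 takes the rest, so the result depends directly on how granular the population data is.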

danielolsen commented 3 years ago

The Census Bureau has released some information on the 2020 census (Redistricting Data (PL 94-171)), but does not appear to have released the 'full' data yet (i.e. what was called Summary File 1 in the 2010 census). It seems like more detailed data will come out sometime in 2022, according to their blog post from last week: https://www.census.gov/newsroom/blogs/random-samplings/2021/09/upcoming-2020-census-data-products.html

Looking into querying population data directly from the U.S. Census Bureau, I've found several leads on getting ZIP-code-level population data, but no slam dunks yet:

- The Census Bureau has an online data explorer tool which can be used to download population by census block. If we want to aggregate by ZIP Code Tabulation Area (ZCTA, the census's analog to the USPS's ZIP codes), we may be able to do so by mapping 2020 census blocks to 2010 census blocks (https://www.census.gov/programs-surveys/geography/technical-documentation/records-layout/2020-census-block-record-layout.html), aggregating the block data to census tracts, and then mapping 2010 census tracts to 2010 ZCTAs (http://www2.census.gov/geo/docs/maps-data/data/rel/zcta_tract_rel_10.txt). Unfortunately, the 2020 census does not seem to have any geographical mappings available yet (see https://www.census.gov/geographies/reference-files/time-series/geo/relationship-files.2020.html).
- The Census Bureau releases 'Legacy Format Summary Files', which appear to be CSV-like text files, although how to interpret these files is not immediately apparent. For example, the Alabama file has >255k rows, but the online data explorer tool only gives data for <186k census blocks in Alabama, and the first row seems to contain the population for the entire state.
- The Census Bureau has an API: https://api.census.gov/data.html, which has the 2020 redistricting data available. There's a Python package aimed at making these APIs easier to use (https://github.com/jtleider/censusdata), but it's not currently set up to read redistricting data, although we may be able to extend it by putting a JSON file in the right location (we can get the JSON file from the Census Bureau API documentation).
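
For the API route, here is a rough sketch of what a direct query could look like without the `censusdata` package. It assumes the 2020 redistricting dataset is served at `https://api.census.gov/data/2020/dec/pl` and that `P1_001N` is the total-population variable; both should be checked against the API documentation. The helper names and the sample response values are made up, and no network request is made here.

```python
from urllib.parse import urlencode

# Assumed endpoint for the 2020 redistricting (PL 94-171) data.
BASE_URL = "https://api.census.gov/data/2020/dec/pl"


def county_population_url(state_fips):
    """Build a query URL for total population of every county in one state."""
    params = {
        "get": "NAME,P1_001N",  # P1_001N: assumed total-population variable
        "for": "county:*",
        "in": f"state:{state_fips}",
    }
    # Keep ':', '*' and ',' unescaped so the URL stays readable.
    return f"{BASE_URL}?{urlencode(params, safe=':*,')}"


def parse_population(rows):
    """Convert the API's list-of-lists JSON (header row first) to {FIPS: population}."""
    header, *records = rows
    i_pop = header.index("P1_001N")
    i_state = header.index("state")
    i_county = header.index("county")
    return {r[i_state] + r[i_county]: int(r[i_pop]) for r in records}


url = county_population_url("01")

# Made-up response in the shape the API returns, for illustration only.
sample_response = [
    ["NAME", "P1_001N", "state", "county"],
    ["Example County A, Alabama", "1000", "01", "001"],
    ["Example County B, Alabama", "2500", "01", "003"],
]
population_by_county = parse_population(sample_response)
```

The response would come from fetching `url` and decoding the JSON body. As noted, this only gets us to county-level totals unless a finer geography is requested.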

As a reminder, simplemaps.com also provides population estimates per-ZIP and per-county, but the process by which they got to these values is a bit opaque.

EDIT: Census redistricting data from 2020 can give us populations by county directly, but ideally we would like something more granular, since counties are often large areas with varying population density, and distributing the demand naively across all substations in a county will probably produce bad results.

danielolsen commented 3 years ago

The 2019 American Community Survey can give us population by PUMA, but PUMAs would still be ~15-20x larger than ZIP codes on average (about 2,400 PUMAs vs. about 42,000 ZIP codes), and we would need to use the lat/lon of the substations to deduce the PUMA rather than taking the ZIP code directly.
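
To illustrate what deducing the PUMA from lat/lon involves, here is a dependency-free sketch using a ray-casting point-in-polygon test. This is a stand-in for a proper spatial join in a GIS library, and it assumes each area is a single simple polygon given as a list of (lon, lat) vertices. The function names, area IDs, and coordinates are all hypothetical.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray casting: count edge crossings of a ray going east from the point."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_area(lon, lat, areas):
    """Return the ID of the first area containing the point, or None."""
    for area_id, polygon in areas.items():
        if point_in_polygon(lon, lat, polygon):
            return area_id
    return None


# Hypothetical square "PUMAs" and substation coordinates as (lon, lat).
areas = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
substations = {"s1": (1, 1), "s2": (3, 0.5), "s3": (5, 5)}
area_by_substation = {
    name: assign_area(lon, lat, areas) for name, (lon, lat) in substations.items()
}
```

Real PUMA boundaries are multipolygons with holes, and looping over every polygon for every substation is slow, so in practice a library with a spatial index would be the better tool.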

It seems that, at least internally, the Census Bureau has a mapping of 2020 Census blocks to 2020 ZCTAs (see https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.2020.html). We may be able to use geopandas or a similar tool to deduce this mapping and then sum the available 2020 block data to ZCTAs, but it would be really great if we didn't have to do that manually.
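
Once a block-to-ZCTA mapping exists, the summation step itself is simple. A sketch, assuming the mapping and the block populations are both available as dicts keyed by block GEOID; the function name, GEOIDs, ZCTAs, and populations below are made up for illustration.

```python
from collections import defaultdict


def sum_blocks_to_zcta(block_population, block_to_zcta):
    """Sum block populations by ZCTA; also return population left unassigned."""
    zcta_population = defaultdict(int)
    unassigned = 0
    for block_geoid, population in block_population.items():
        zcta = block_to_zcta.get(block_geoid)
        if zcta is None:
            # Blocks without a ZCTA are tracked rather than silently dropped.
            unassigned += population
        else:
            zcta_population[zcta] += population
    return dict(zcta_population), unassigned


# Hypothetical 15-digit block GEOIDs with made-up populations.
block_population = {
    "010010201001000": 40,
    "010010201001001": 60,
    "010010202002000": 25,
    "010010203003000": 5,
}
block_to_zcta = {
    "010010201001000": "36067",
    "010010201001001": "36067",
    "010010202002000": "36066",
}
zcta_population, unassigned = sum_blocks_to_zcta(block_population, block_to_zcta)
```

Keeping the unassigned total visible gives a quick check on how much population the mapping fails to cover.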

danielolsen commented 2 years ago

Closed by #235.

Breakthrough-Energy / PreREISE

Build distribution of demand to HIFLD buses #229
