Analyses on final dataframe

Summary of aggregations for @TrentonBush:

This analysis aggregates the output dataframe of qualifying areas so that there is one record for each county that contains a qualifying area. The input is a dataframe of all qualifying records: one record for each Census tract that qualifies via the brownfields or coal criteria, and one record for each county that qualifies via the employment criteria. This input dataframe is the result of energy_comms.coordinate.get_all_qualifying_areas(). The output is a dataframe with one record for each county that has some qualifying area within it. It includes the following columns:

county_id_fips (the index), county_name: The FIPS code and name of the county.
state_name: The name of the state that the county is a part of.
num_brownfields: The number of brownfields in a county.
num_coal_qualifying_tracts: The number of Census tracts within the county that contain a closed coal plant or mine, or are adjacent to a tract that does.
percent_of_county_coal_qualified: The percentage of the total area of the county that is qualified via the coal closure criteria.
qualifies_by_employment_criteria: Whether the county qualifies by the employment criteria, meaning it is part of an MSA that meets the criteria for fossil employment and national unemployment.

Brownfields Aggregation

The input dataframe is grouped by county FIPS code, and the brownfield records within each county are counted to get the total number of brownfields in the county.
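As a rough illustration of this step, the sketch below counts brownfield records per county with pandas. The `criteria` column and its values are hypothetical stand-ins; the actual column names in the output of get_all_qualifying_areas() may differ.

```python
import pandas as pd

# Toy stand-in for the qualifying-areas dataframe. The "criteria" column
# and its values are hypothetical, chosen only for illustration.
qualifying = pd.DataFrame(
    {
        "county_id_fips": ["01001", "01001", "01003", "01003", "01005"],
        "criteria": [
            "brownfield",
            "brownfield",
            "brownfield",
            "coalclosure",
            "coalclosure",
        ],
    }
)

# Keep only brownfield records, then count them per county.
brownfields = qualifying[qualifying["criteria"] == "brownfield"]
num_brownfields = (
    brownfields.groupby("county_id_fips").size().rename("num_brownfields")
)
```

Counties with no brownfield records (here `01005`) do not appear in the grouped result, so they would need to be filled with 0 when the counts are joined onto the county-level output.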

Coal Aggregation

The input dataframe is grouped by county FIPS code, and the coal qualifying Census tracts within each county are counted to get the total number of coal qualifying tracts in the county.

To get the percentage of area that qualifies, the area of each qualifying Census tract is first calculated with geopandas from the tract geometries in the Census DP1 geodatabase. The total area of each county is then calculated. Finally, the input dataframe is grouped by county FIPS code, the areas of the qualifying tracts within each county are summed, and that sum is divided by the county's total area to get the percentage of the county's area that qualifies.
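The coal aggregation can be sketched in pandas as follows. This is a minimal example under stated assumptions: the `tract_id_fips` and `tract_area` columns and the `county_area` series are hypothetical, and the areas are hard-coded rather than derived from geometries. In practice the areas could come from the `area` attribute of a geopandas GeoSeries, ideally after projecting to an equal-area CRS; the description above does not say which projection is used.

```python
import pandas as pd

# Hypothetical per-tract table of coal-qualifying tracts. The area values
# are made up; real ones would be computed from the tract geometries.
coal_tracts = pd.DataFrame(
    {
        "county_id_fips": ["01001", "01001", "01003"],
        "tract_id_fips": ["01001020100", "01001020200", "01003010100"],
        "tract_area": [10.0, 15.0, 20.0],
    }
)

# Hypothetical total county areas, in the same units as tract_area.
county_area = pd.Series({"01001": 100.0, "01003": 40.0}, name="county_area")
county_area.index.name = "county_id_fips"

# Count qualifying tracts and sum their areas per county.
coal = coal_tracts.groupby("county_id_fips").agg(
    num_coal_qualifying_tracts=("tract_id_fips", "nunique"),
    qualifying_area=("tract_area", "sum"),
)

# Divide by the county total; pandas aligns the two on county_id_fips.
# Scaling to 0-100 is an assumption about how the percentage is stored.
coal["percent_of_county_coal_qualified"] = (
    100 * coal["qualifying_area"] / county_area
)
```

With these toy numbers, county `01001` has two qualifying tracts covering 25 of 100 area units, so 25% of the county qualifies.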

Employment Criteria Aggregation

The input dataframe already includes a record for each county that qualifies via the employment criteria. These employment qualifying counties are merged onto the output dataframe to create a boolean column identifying whether a county qualifies via the employment criteria.
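One way to express this merge is sketched below. The `county_output` and `employment` frames are hypothetical, and the outer join is an assumption, chosen so that counties qualifying only by employment still get a record.

```python
import pandas as pd

# Hypothetical county-level output from the brownfield and coal steps.
county_output = pd.DataFrame(
    {"county_id_fips": ["01001", "01003"], "num_brownfields": [2, 1]}
)

# Hypothetical employment-qualifying county records from the input dataframe.
employment = pd.DataFrame({"county_id_fips": ["01003", "01005"]})

# The outer merge keeps counties from both sides; the indicator column
# records which side each row came from.
merged = county_output.merge(
    employment, on="county_id_fips", how="outer", indicator=True
)

# A county qualifies by employment if it appeared in the employment records.
merged["qualifies_by_employment_criteria"] = merged["_merge"] != "left_only"
merged = merged.drop(columns="_merge").set_index("county_id_fips")
```

Here `01005` appears only in the employment records, so it enters the output with the flag set to True and a missing `num_brownfields` value that would need to be filled.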

The output dataframe contains the above columns and one record for each county that either contains an energy communities qualifying area or qualifies in its entirety.

catalyst-cooperative / rmi-energy-communities

Analyses on final dataframe #88