Demographic Analysis - Githubissues

AmandaGIN commented 6 years ago

Based off of notes at: https://docs.google.com/document/d/1IeokJ4ESYAG4ADN4ikZM-dX8Mf56gKFyMX_6HTWpHKk/edit
1) Remove block-groups with 0 population (72 block-groups removed)
2) Erase water (using TIGER 2016 Areal_Hydrology layer): N:\USA\Census\Tiger2016\tlgdb_2016_a_us_areawater.gdb

AmandaGIN commented 6 years ago

Generate 3 tract files:
1) Tract count: age, race, hh
2) Tract count: poverty, poverty185% (there are 43 null block-groups to remove)
3) Tract value: mhhi (remove null block-groups)

AmandaGIN commented 6 years ago

Because demographic variable availability differs, the Service Area to Block-Group analysis needs to be run three times:

Block-group counts: population, age, race, household. All other attribute columns removed.
Feature Dataset: Processing_BlockGroups
Feature Class: C_BlkGrps_Populated_MinusWater_AgeRaceHH_counts

Block-group counts: poverty hh, poverty individual, 185% poverty. All other attribute columns removed.
Feature Dataset: Processing_BlockGroups
Feature Class: D_BlkGrps_Populated_MinusWater_PovertyIndHH185_counts

Block-group values: median household income. All other attribute columns removed.
Feature Dataset: Processing_BlockGroups
Feature Class: E_BlkGrps_Populated_MinusWater_MHHI_value
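The three inputs differ only in which block-groups survive the null filter. A minimal sketch of that split, using hypothetical in-memory records rather than the geodatabase feature classes (the field names `tot_pop`, `pov_ind`, and `mhhi` and the sample values are illustrative assumptions, not the real schema):

```python
# Hypothetical block-group records; None stands for a null ACS value.
block_groups = [
    {"geoid": "A", "tot_pop": 1200, "pov_ind": 150, "mhhi": 61000},
    {"geoid": "B", "tot_pop": 0, "pov_ind": None, "mhhi": None},
    {"geoid": "C", "tot_pop": 800, "pov_ind": None, "mhhi": 48000},
    {"geoid": "D", "tot_pop": 300, "pov_ind": 40, "mhhi": None},
]


def with_values(records, field):
    """Keep only records where `field` is not null."""
    return [r for r in records if r[field] is not None]


# Step 1: drop unpopulated block-groups before anything else.
populated = [r for r in block_groups if r["tot_pop"] > 0]

# Step 2: one input per variable family, each with its own null filter.
age_race_hh = populated
poverty = with_values(populated, "pov_ind")
mhhi = with_values(populated, "mhhi")

print([r["geoid"] for r in age_race_hh])
print([r["geoid"] for r in poverty])
print([r["geoid"] for r in mhhi])
```

Keeping the filters separate means a block-group with a null MHHI still contributes to the age, race, and household counts.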

Counts will be analyzed by assuming the count is evenly distributed across the block-group area and assigning the share of that block-group that is covered by a service area. The value (MHHI only) will be calculated as a population-weighted average.

AmandaGIN commented 6 years ago

Retailer preparation: BOE stores buffered by a half-mile, with the area of each buffer calculated. Each buffer is 502.574 acres per field Acres_ServiceArea.
Database: P:\proj_p_s\Stanford_PRC\BOE_2017\data\DemographicAnalysis\BOE2017_DemographicAnalysis.gdb
Feature Dataset: Processing_Retailers
Feature Class: A_BOE2017_EUCHalfMile
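As a sanity check on the buffer size, the area of a true half-mile circle can be computed directly. This is only a back-of-envelope sketch; the helper name `circle_acres` is made up for illustration.

```python
import math

SQ_MILE_IN_ACRES = 640.0  # 1 square mile = 640 acres


def circle_acres(radius_miles):
    """Area of a true circle of the given radius, in acres."""
    return math.pi * radius_miles ** 2 * SQ_MILE_IN_ACRES


ideal = circle_acres(0.5)
reported = 502.574  # value of Acres_ServiceArea quoted in this thread

print(round(ideal, 3))             # ideal circle, about 502.655 acres
print(round(ideal - reported, 3))  # shortfall of the reported buffer
```

The reported value sits slightly below the ideal circle. One plausible explanation, not confirmed anywhere in this thread, is that the GIS buffer is a polygon approximating a circle.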

Using the demographic layer for Counts of Age, Race, HH (C_BlkGrps_Populated_MinusWater_AgeRaceHH_counts), clip the service areas to their extent. This removes any part of a service area where we do not have age, race, or household data. Calculate the new area of the buffer. Field: AcresServiceArea_AgeRacHHcounts
Note that some clipped service areas are extremely small:
Removed entirely = 25 stores
Under 0.5 acre = 2 stores
0.5 - 1 acre = 2 stores
1-5 acres = 3 stores
20-30 acres = 10 stores
40-100 acres = 6 stores
100.1 - 200 = 52 stores
200.1 - 300 = 203 stores
300.1 - 400 = 594 stores
Above 400, up to the full 502.574 = 32,147 stores

The 25 entirely removed were flagged in the A_BOE2017_EUCHalfMile feature class, field: NoDemogOverlap, “No ACS2016 5-yr demog data available for service area”
Those with very small areas (under 1 acre) were skimmed. They appear to be areas near airports or other non-residential lands that slightly overlap with a populated block-group.
Database: P:\proj_p_s\Stanford_PRC\BOE_2017\data\DemographicAnalysis\BOE2017_DemographicAnalysis.gdb
Feature Dataset: Processing_Retailers
Feature Class: B_BOE2017_EUCHalfMile_CLIP_Counts_AgeRaceHH
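The review of clipped areas can be sketched as a small classifier. Everything here is illustrative: the class labels, the rounding tolerance, and the license numbers are assumptions, and only the 502.574-acre full buffer and the under-1-acre threshold come from the notes above.

```python
FULL_BUFFER_ACRES = 502.574  # unclipped half-mile buffer, per Acres_ServiceArea
TOLERANCE = 0.001            # assumed rounding slack, in acres


def classify(clipped_acres):
    """Return a review class for one clipped service area."""
    if clipped_acres is None or clipped_acres <= 0:
        return "no overlap"   # candidate for the NoDemogOverlap flag
    if clipped_acres < 1:
        return "sliver"       # small enough to deserve a visual check
    if clipped_acres < FULL_BUFFER_ACRES - TOLERANCE:
        return "partial"
    return "full"


# Hypothetical license numbers and clipped areas (None = nothing left after clip).
clipped = {"L001": 502.574, "L002": 0.31, "L003": 287.9, "L004": None}
review = {lic: classify(acres) for lic, acres in clipped.items()}
print(review)
```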

Using the demographic layer for Counts of Poverty Ind, Poverty HH, Poverty 185% (D_BlkGrps_Populated_MinusWater_PovertyIndHH185_counts), clip the service areas (A_BOE2017_EUCHalfMile) to their extent. This removes any part of a service area where we do not have poverty data. Calculate the new area of the buffer. Field: AcresServiceArea_Poverty
Database: P:\proj_p_s\Stanford_PRC\BOE_2017\data\DemographicAnalysis\BOE2017_DemographicAnalysis.gdb
Feature Dataset: Processing_Retailers
Feature Class: C_BOE2017_EUCHalfMile_CLIP_Counts_PovertyIndHH185

*The results appear to be the same for both of the store files related to count variables:
B_BOE2017_EUCHalfMile_CLIP_Counts_AgeRaceHH
C_BOE2017_EUCHalfMile_CLIP_Counts_PovertyIndHH185
This may not always be the case, so the files have been run independently.

Using the demographic layer for Value - MHHI (E_BlkGrps_Populated_MinusWater_MHHI_value), clip the service areas (A_BOE2017_EUCHalfMile) to their extent. This removes any part of a service area where we do not have median household income data. Calculate the new area of the buffer. Field: AcresServiceArea_MHHI
Note that some clipped service areas are extremely small:
Removed entirely = 70 stores
Under 0.5 acre = 11 stores
0.5 - 1 acre = 3 stores
1-5 acres = 5 stores
5-20 acres = 6 stores
20.1-30 acres = 3 stores
30.1-100 acres = 21 stores
100.1 - 200 = 126 stores
200.1 - 300 = 432 stores
300.1 - 400 = 1,444 stores
Above 400, up to the full 502.574 = 30,923 stores

The 70 entirely removed were flagged in the A_BOE2017_EUCHalfMile feature class, field: NoDemogOverlap, “No ACS2016 5-yr demog value data available for service area”
Those with very small areas (under 1 acre) were skimmed. They
Database: P:\proj_p_s\Stanford_PRC\BOE_2017\data\DemographicAnalysis\BOE2017_DemographicAnalysis.gdb
Feature Dataset: Processing_Retailers
Feature Class: D_BOE2017_EUCHalfMile_CLIP_Value_MHHI
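A quick arithmetic check on the two tallies quoted in this comment. If every store falls in exactly one size class (an assumption, since a few acreage ranges are not listed in the first tally), both tallies should add up to the same store total.

```python
# Store counts per size class, copied from the tallies above.
age_race_hh_tally = [25, 2, 2, 3, 10, 6, 52, 203, 594, 32147]
mhhi_tally = [70, 11, 3, 5, 6, 3, 21, 126, 432, 1444, 30923]

total_counts = sum(age_race_hh_tally)
total_mhhi = sum(mhhi_tally)

print(total_counts, total_mhhi)    # 33044 33044
print(total_counts == total_mhhi)  # True
```

Both tallies sum to 33,044, which is consistent with every buffered store being accounted for in each clip.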

AmandaGIN commented 6 years ago

Intersect Demog and Service Areas (3 times):

AgeRaceHH - intersect:
B_BOE2017_EUCHalfMile_CLIP_Counts_AgeRaceHH
C_BlkGrps_Populated_MinusWater_AgeRaceHH_counts
Results:
Feature Dataset: Processing_RetailerBlkGrp_Intersects
Feature Class: A_INTERSECT_AgeRAceHH_Counts

Poverty Ind, HH, 185% - intersect:
C_BOE2017_EUCHalfMile_CLIP_Counts_PovertyIndHH185
D_BlkGrps_Populated_MinusWater_PovertyIndHH185_counts
Results:
Feature Dataset: Processing_RetailerBlkGrp_Intersects
Feature Class: B_INTERSECT_PovertyIndHH185_Counts

MHHI - intersect:
D_BOE2017_EUCHalfMile_CLIP_Value_MHHI
E_BlkGrps_Populated_MinusWater_MHHI_value
Results:
Feature Dataset: Processing_RetailerBlkGrp_Intersects
Feature Class: C_INTERSECT_MHHI_Value

Repair geometry on all 3 results (remove self-intersections) and compact the database:
P:\proj_p_s\Stanford_PRC\BOE_2017\data\DemographicAnalysis\RepairGeometry_ServiceAreaBlockGroup_Intersects

AmandaGIN commented 6 years ago

Determine Proportion of Attribute to assign
We assume equal distribution of a count variable across the block-group area. To do this we determine the % of the block-group inside the service area and apply that percent to each count attribute.

1) Add acres of the block-group remaining after the intersect (acres inside the service area): [Acres_BlkGrp_InServiceArea]
2) Calculate the percent that is of the full block-group area: [BlkGrp_PctInServiceArea] = [Acres_BlkGrp_InServiceArea] / [Acres_BlkGrp_NoWater]
3) Multiply each count variable by [BlkGrp_PctInServiceArea], e.g. [TotPopXPctBlkGrp] = [BlkGrp_PctInServiceArea] * [TotPop]. The same multiplication, applied to each field's own source count, fills:
[Age0_17XPctBlkGrp]
[Age5_17XPctBlkGrp]
[Age18_20XPctBlkGrp]
[Age21_24XPctBlkGrp]
[NHLAfAmerXPctBlkGrp]
[NHLAsianXPctBlkGrp]
[NHL_PIXPctBlkGrp]
[NHLWhiteXPctBlkGrp]
[NHLAIndXPctBlkGrp]
[NHLOtherXPctBlkGrp]
[NHL2plusXPctBlkGrp]
[HisLatXPctBlkGrp]
[HHcntXPctBlkGrp]
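The area-proportional allocation described above can be sketched in a few lines. The sample acreages, counts, and license numbers are made up, and the per-store sum at the end is an assumption about the follow-on step rather than something stated for the count variables in this comment.

```python
from collections import defaultdict


def allocate(piece_acres, block_group_acres, count):
    """Assign a count to an intersect piece in proportion to its area share."""
    if block_group_acres <= 0:
        raise ValueError("block-group area must be positive")
    pct_in_service_area = piece_acres / block_group_acres
    return pct_in_service_area * count


# Hypothetical intersect pieces:
# (LicenseNum, Acres_BlkGrp_InServiceArea, Acres_BlkGrp_NoWater, TotPop)
pieces = [
    ("L001", 30.0, 120.0, 1000),
    ("L001", 50.0, 200.0, 400),
    ("L002", 10.0, 40.0, 800),
]

# Sum the allocated population of all pieces belonging to each store.
store_pop = defaultdict(float)
for license_num, in_acres, bg_acres, tot_pop in pieces:
    store_pop[license_num] += allocate(in_acres, bg_acres, tot_pop)

print(dict(store_pop))
```

Each piece here covers a quarter of its block-group, so it receives a quarter of that block-group's population.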

The next set of data (poverty) is not available for all of the block-groups, which is why the B_INTERSECT_PovertyIndHH185_Counts file was created. The same area calculation [Acres_BlkGrp_InServiceArea] is performed, the % is calculated [BlkGrp_PctInServiceArea], and it is applied to the poverty variables:

[BlPovHHXPctBlkGrp] = [BlkGrp_PctInServiceArea] * [Bl_Pov_HH]
[BlPovIndPctBlkGrp] = [BlkGrp_PctInServiceArea] * [Bl_Pov_Ind]
[PovHHPollPctBlkGrp] = [BlkGrp_PctInServiceArea] * [PovHHPoll]
[PovIndPollPctBlkGr] = [BlkGrp_PctInServiceArea] * [PovIndPoll]
[Est185PovPctBlkGrp] = [BlkGrp_PctInServiceArea] * [Est185PovertyBlockGroup]

The ‘Poll’ fields for households and individuals are the total number of people asked about poverty (or for whom it was determined). This is the value to use as the denominator in percentage calculations.
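A minimal sketch of the percentage calculation with the Poll total as the denominator. The helper name `poverty_rate` and the sample numbers are hypothetical; the inputs stand for the area-allocated [Bl_Pov_Ind] and [PovIndPoll] values summed over one service area.

```python
def poverty_rate(below_poverty, polled):
    """Share of the polled universe that is below poverty.

    Returns None when nobody was polled, so an empty service area
    does not raise a division error.
    """
    if not polled:
        return None
    return below_poverty / polled


# Hypothetical service-area sums of the allocated poverty fields.
individuals_below = 45.0    # sum of allocated Bl_Pov_Ind
individuals_polled = 300.0  # sum of allocated PovIndPoll

rate = poverty_rate(individuals_below, individuals_polled)
print(rate)
```

Using the Poll total rather than total population keeps the numerator and denominator in the same universe.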

The last set of data (median household income) to process is a value (median) as opposed to a count (sum). To determine the service area MHHI we perform a weighted average. The weighting is by population.
1) Using the intersect file C_INTERSECT_MHHI_Value, calculate the area of block-group in service area: [Acres_BlkGrp_InServiceArea]
2) Calculate the percent it is of the block-group: [PctPieceIsOfBlkGrp] = [Acres_BlkGrp_InServiceArea] / [Acres_BlkGrp_NoWater]
3) Apply the [PctPieceIsOfBlkGrp] to population: [PopXPctOfBlkGrp] = [PctPieceIsOfBlkGrp] * [TotPop]
4) Summarize on the store license number column [LicenseNum], adding the [PopXPctOfBlkGrp]. This is the total population of the service area. It is calculated again because the intersection of the counts (pop, age, race, hh) has data in more areas. File: C_Intersect_SUM_Population_Count
5) Join the results C_Intersect_SUM_Population_Count back to the intersected file C_INTERSECT_MHHI_Value.shp
6) Copy over (to a new field [ServiceAreaPOP]) the summarized population of each licensed retailer
7) Calculate the percent of the service area population that each block-group piece represents: [Pop_Pct_ofServiceArea] = [PopXPctOfBlkGrp] / [ServiceAreaPOP]
8) Multiply the results: [MHHIxPopPctofSA] = [Pop_Pct_ofServiceArea] * [MHHI]
9) Again, summarize on license number [LicenseNum], but this time add the [MHHIxPopPctofSA]. The sum is the population-weighted average of the MHHI values in the service area.
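The population-weighted average can be sketched end to end as two passes over the intersect pieces. The records below are invented for illustration, and skipping zero-population service areas is an assumption about how an empty denominator should be handled.

```python
from collections import defaultdict

# Hypothetical intersect pieces:
# (LicenseNum, Acres_BlkGrp_InServiceArea, Acres_BlkGrp_NoWater, TotPop, MHHI)
pieces = [
    ("L001", 50.0, 100.0, 1000, 40000),
    ("L001", 75.0, 100.0, 2000, 80000),
    ("L002", 10.0, 40.0, 800, 55000),
]

# Pass 1: population of each service area (the ServiceAreaPOP field).
service_area_pop = defaultdict(float)
for lic, in_acres, bg_acres, tot_pop, _mhhi in pieces:
    service_area_pop[lic] += in_acres / bg_acres * tot_pop

# Pass 2: weight each piece's MHHI by its share of that population and sum.
weighted_mhhi = defaultdict(float)
for lic, in_acres, bg_acres, tot_pop, mhhi in pieces:
    if service_area_pop[lic] == 0:
        continue  # no population, so no weighted average (assumed handling)
    pop_piece = in_acres / bg_acres * tot_pop
    weighted_mhhi[lic] += pop_piece / service_area_pop[lic] * mhhi

print(dict(weighted_mhhi))
```

For L001 the pieces hold 500 and 1,500 people, so the weights are 0.25 and 0.75 and the result is 0.25 * 40,000 + 0.75 * 80,000 = 70,000.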

AmandaGIN commented 6 years ago

sent to client: https://www.dropbox.com/s/wtj2nygerhdtqaa/BOE2017_Store_HalfMileDemographics.xlsx?dl=0

P:\proj_p_s\Stanford_PRC\BOE_2017\data\DemographicAnalysis\BOE2017_Store_HalfMileDemographics.xlsx

GreenInfo-Network / Stanford-BOE2017

Demographic Analysis #14