BethMattern closed this issue 2 years ago
Just adding this as a note. From the CalEnviroScreen methodology (verbatim):
For each census tract, the data was analyzed to estimate the number of households with household incomes less than 80% of the county median and renter or homeowner costs that exceed 50% of household income. The percent of the total households in each tract that are both low-income and housing-burdened was then calculated.
Unlike the US Census, CHAS data are ACS estimates which come from a sample of the population and may be unreliable if they are based on a small sample or population size. The standard error (SE) and relative standard error (RSE) were used to evaluate the reliability of each estimate.
The SE was calculated for each census tract using the formula for approximating the SE of proportions provided by the ACS (American Community Survey Office, 2013, pg. 13, equation 4). When this approximation could not be used, the formula for approximating the SE of ratios (equation 3) was used instead.
The RSE is calculated by dividing a tract’s SE by its estimate of the percentage of housing burdened low income households, and taking the absolute value of the result.
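For reference, the SE and RSE steps described above could be sketched roughly as follows. This is a minimal illustration, not CalEnviroScreen's actual code: the function and parameter names are hypothetical, and it assumes the standard ACS approximations, where the proportion formula subtracts the denominator term under the square root and the ratio formula adds it, with the ratio formula used when the proportion formula's radicand is negative. The RSE is returned as a fraction, not multiplied by 100.

```python
import math


def se_of_proportion(x, y, se_x, se_y):
    """Approximate the SE of the proportion p = x / y.

    x, se_x: numerator estimate and its SE (hypothetical names),
             e.g. low-income, housing-burdened households in a tract.
    y, se_y: denominator estimate and its SE, e.g. total households.
    Returns None when the denominator is zero.
    """
    if y == 0:
        return None
    p = x / y
    # Proportion formula: the radicand can go negative for some inputs.
    radicand = se_x ** 2 - (p ** 2) * (se_y ** 2)
    if radicand < 0:
        # Fall back to the ratio formula, which adds the second term.
        radicand = se_x ** 2 + (p ** 2) * (se_y ** 2)
    return math.sqrt(radicand) / y


def relative_standard_error(se, estimate):
    """RSE = |SE / estimate|; None when it cannot be computed."""
    if se is None or estimate == 0:
        return None
    return abs(se / estimate)
```

With made-up inputs, `se_of_proportion(50, 200, 10, 20)` uses the proportion formula, while `se_of_proportion(190, 200, 10, 20)` falls through to the ratio formula because the first radicand is negative.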
Census tract estimates that met either of the following criteria were considered reliable and included in the analysis:
Census tracts with unreliable estimates receive no score for the indicator (null). The indicator is not factored into that tract’s overall CalEnviroScreen score.
Census tracts that met the inclusion criteria were ordered by percent housing burdened low-income households. The census tracts were assigned percentiles based on the distribution across all tracts.
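A rough sketch of the null handling and percentile step, for anyone picking this up later. It is an assumption-laden illustration: the reliability criteria are not reproduced here, so each tract carries a precomputed `is_reliable` flag, the input layout is hypothetical, and the percentile definition (share of included tracts with a value at or below the tract's value) is one common choice that the quoted text does not specify.

```python
import bisect


def assign_percentiles(tracts):
    """Map tract id -> percentile, or None for excluded tracts.

    tracts: dict of tract_id -> (percent_burdened, is_reliable).
    Both the layout and the is_reliable flag are hypothetical; the
    flag stands in for the reliability criteria, which are not
    implemented here.
    """
    # Only reliable tracts with a value contribute to the distribution.
    included = sorted(
        value for value, ok in tracts.values() if ok and value is not None
    )
    n = len(included)
    percentiles = {}
    for tract_id, (value, ok) in tracts.items():
        if not ok or value is None or n == 0:
            # Unreliable estimate: no score for this indicator (null).
            percentiles[tract_id] = None
        else:
            # Count of included tracts at or below this tract's value.
            rank = bisect.bisect_right(included, value)
            percentiles[tract_id] = 100.0 * rank / n
    return percentiles
```

Excluded tracts stay in the output as `None` so that downstream scoring can skip the indicator for them instead of treating the missing value as zero.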
This ticket is already completed, or at least a basic version is: see census_tracts_score_comparisons in the comparison tool.
I'd propose we close this for now – please re-open if it feels premature.
@saran-ahluwalia let me know if I'm misunderstanding, but I'm not sure the comments you've pasted above are relevant to this topic? They're definitely relevant to conversations about handling null values, etc., but not to this specific ticket.
@lucasmbrown-usds Yes, that was just a note for myself and posterity outlining CalEnviroScreen's method for handling tracts with null values. That was all. It can serve as a reference for future issues.
Problem statement/question
Based on https://github.com/usds/justice40-tool/issues/245 and https://github.com/usds/justice40-tool/issues/135, it would be interesting to run a statistical analysis of the difference between census tracts included as priority communities in a comparison metric but not in the current CEJST score. E.g., "Census tracts that are included in CalEnviroScreen disadvantaged communities but are not included in the current CEJST priority communities have 20% higher incomes and 34% less linguistic isolation on average than CalEnviroScreen census tracts that are included in the current CEJST priority communities."
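One way the comparison in the example sentence could be computed is sketched below. It is only an illustration under stated assumptions: the record layout and the field names (`in_ces`, `in_cejst`, and the indicator key) are hypothetical and do not come from the comparison tool, and a real analysis would likely also want population weighting and a significance test.

```python
from statistics import mean


def percent_difference_in_means(tracts, field):
    """Percent difference in the mean of `field` between two groups.

    Group 1: tracts in the comparison metric but not in CEJST.
    Group 2: tracts in both the comparison metric and CEJST.
    tracts: list of dicts with hypothetical keys "in_ces", "in_cejst",
            and the indicator named by `field`.
    Returns None if either group is empty or the baseline mean is 0.
    """
    ces_only = [
        t[field] for t in tracts
        if t["in_ces"] and not t["in_cejst"] and t[field] is not None
    ]
    both = [
        t[field] for t in tracts
        if t["in_ces"] and t["in_cejst"] and t[field] is not None
    ]
    if not ces_only or not both:
        return None
    baseline = mean(both)
    if baseline == 0:
        return None
    # Positive result: CES-only tracts have a higher mean than tracts in both.
    return 100.0 * (mean(ces_only) - baseline) / baseline
```

Calling this once per indicator (income, linguistic isolation, and so on) would give the numbers needed to fill in a sentence like the one quoted above.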