Open Oren-T opened 1 month ago
There are some validations that need to be done on the current data:
Undercounting:
There is a funding_office_name called "COMMUNITY ORIENTED POLICING SERVICE," which seems like something we should be including. However, the program_activities_funding_this_award column is blank in the example I have (award is available here). I found it by searching for "chp" in the description.
program_activities_funding_this_award column is blank.
Overcounting:
Cities/States/Sub-Grants
Many state capitals receive grants that are then allocated to individual counties or cities within the state. As a result, funding numbers for capital cities are likely over-counted, and funding for non-capital cities is under-counted.
As a temporary fix, we are filtering on the business_types_description column to remove any grants awarded to a "STATE GOVERNMENT". Long-term, however, we want to incorporate sub-grants as a more accurate way to process these data.
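The temporary filter could look roughly like the sketch below. It assumes the rows have already been loaded as dictionaries (for example with csv.DictReader) and that the description is matched exactly after trimming; the helper name and the sample rows are hypothetical.

```python
def drop_state_government(rows, column="business_types_description"):
    """Return only the rows whose business type is not STATE GOVERNMENT."""
    return [
        row for row in rows
        # Treat a missing or empty description as "not a state government".
        if (row.get(column) or "").strip().upper() != "STATE GOVERNMENT"
    ]


# Made-up rows for illustration only.
sample_rows = [
    {"recipient_name": "EXAMPLE STATE AGENCY", "business_types_description": "STATE GOVERNMENT"},
    {"recipient_name": "EXAMPLE CITY", "business_types_description": "CITY OR TOWNSHIP GOVERNMENT"},
    {"recipient_name": "EXAMPLE NONPROFIT", "business_types_description": ""},
]
kept = drop_state_government(sample_rows)
```

An exact match keeps the filter narrow; if a row can carry more than one business type in this column, a substring check may be the better choice.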
For instance, this VOCA grant to Chicago has 26 sub-awards which specify different cities around the state where the funds were sent. However, there are some small layers of added complexity to consider.
One goal for the data is to show each city how much funding it still has available for CVI programs. This becomes difficult or impossible at the sub-grant level, since some of the money won't yet have been awarded in sub-grants.
The action date for sub-grants should be greater than or equal to the primary grant date, but this is another thing that needs to be validated.
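A minimal sketch of that date check follows. The field names prime_award_unique_key and subaward_action_date are assumptions about the sub-award file, dates are assumed to be ISO formatted (YYYY-MM-DD), and the keys and dates in the sample are invented.

```python
from datetime import date


def find_early_subawards(prime_dates, subawards):
    """Split out sub-awards dated before their prime award, and those with no prime match."""
    early, orphaned = [], []
    for sub in subawards:
        prime_date = prime_dates.get(sub["prime_award_unique_key"])
        if prime_date is None:
            # No prime award found for this sub-award; report it separately.
            orphaned.append(sub)
        elif date.fromisoformat(sub["subaward_action_date"]) < prime_date:
            early.append(sub)
    return early, orphaned


# Hypothetical prime award dates keyed by award identifier.
prime_dates = {"PRIME-A": date(2021, 10, 1)}
subawards = [
    {"prime_award_unique_key": "PRIME-A", "subaward_action_date": "2021-10-01"},
    {"prime_award_unique_key": "PRIME-A", "subaward_action_date": "2021-09-15"},
    {"prime_award_unique_key": "PRIME-B", "subaward_action_date": "2022-01-10"},
]
early, orphaned = find_early_subawards(prime_dates, subawards)
```

Here the second sub-award would be flagged as early and the third as having no matching prime award.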
Once this is all accounted for, however, we should be able to merge primary and sub-grant data to better understand how much funding actually reaches each city, ideally resolving much of the over- and under-counting currently happening.
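One possible shape for that merge is sketched below: sum sub-award amounts by destination city, and track how much of each prime award has not yet been sub-awarded. This is only an illustration of the idea, not a settled method; the field names (city, subaward_amount, prime_award_unique_key) and all amounts are hypothetical.

```python
from collections import defaultdict


def summarize_subawards(prime_amounts, subawards):
    """Return (sub-award dollars per city, dollars not yet sub-awarded per prime award)."""
    by_city = defaultdict(float)
    sub_total = defaultdict(float)
    for sub in subawards:
        by_city[sub["city"]] += sub["subaward_amount"]
        sub_total[sub["prime_award_unique_key"]] += sub["subaward_amount"]
    # Whatever has not been passed down yet stays attributed to the prime award.
    unallocated = {
        key: amount - sub_total[key] for key, amount in prime_amounts.items()
    }
    return dict(by_city), unallocated


# Invented figures for illustration.
prime_amounts = {"PRIME-A": 100000.0, "PRIME-B": 50000.0}
subawards = [
    {"prime_award_unique_key": "PRIME-A", "city": "City One", "subaward_amount": 40000.0},
    {"prime_award_unique_key": "PRIME-A", "city": "City Two", "subaward_amount": 20000.0},
]
by_city, unallocated = summarize_subawards(prime_amounts, subawards)
```

The unallocated remainder is the part that makes "remaining available funds" hard to report per city, since it is not yet tied to any sub-recipient.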
There are three "cities" listed in the American Violence data set that I am excluding because they do not align at the city level:
'Louisville/Jefferson County, Kentucky',
'Nashville-Davidson, Tennessee',
'Urban Honolulu CDP, Hawaii'
Baton Rouge has some grants that we exclude:
grants_to_exclude = [
    'https://www.usaspending.gov/award/ASST_NON_B-21-DF-22-0001_8620/',
    'https://www.usaspending.gov/award/ASST_NON_B-18-DP-22-0001_8620/',
    'https://www.usaspending.gov/award/ASST_NON_B-21-DZ-22-0001_8620/',
    'https://www.usaspending.gov/award/ASST_NON_B-22-DF-22-0001_8620/',
]
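Applying an exclusion list like this might look like the following sketch. It assumes each row carries the award URL in a column named usaspending_permalink, which should be checked against the actual download; the helper name and the second sample row are hypothetical.

```python
def exclude_grants(rows, urls_to_exclude, url_column="usaspending_permalink"):
    """Drop rows whose award URL appears in urls_to_exclude."""
    # Ignore trailing slashes so both URL spellings match.
    blocked = {url.rstrip("/") for url in urls_to_exclude}
    return [
        row for row in rows
        if (row.get(url_column) or "").rstrip("/") not in blocked
    ]


# One real entry from the exclusion list, one placeholder row that should survive.
sample_exclusions = [
    "https://www.usaspending.gov/award/ASST_NON_B-21-DF-22-0001_8620/",
]
sample_rows = [
    {"usaspending_permalink": "https://www.usaspending.gov/award/ASST_NON_B-21-DF-22-0001_8620"},
    {"usaspending_permalink": "https://www.usaspending.gov/award/ASST_NON_EXAMPLE_0000/"},
]
remaining = exclude_grants(sample_rows, sample_exclusions)
```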
There are two main data sources we're thinking about right now:
Bulk data downloads from USASpending.gov
OJP funding data
Gun violence data