18F / crime-data-explorer

Moved to https://github.com/fbi-cde
73 stars 20 forks source link

Potential inaccuracies with participation report #302

Closed LarryBafundo closed 6 years ago

LarryBafundo commented 6 years ago

The following is from internal (CJIS) feedback:

"I was in CDE and downloaded the CSV for the Uniform Crime Reporting Program Participation Data. I sorted the data first by state and then by year, the first thing I noticed is that in 2015 Alabama has 52 NIBRS participating agencies, which is incorrect they only have 1. Alaska is not even NIBRS certified and they have several listed as participating. Arizona has 11 listed for 2015 and they only have maybe 5. Arkansas has 19 listed as NIBRS participating, which is incorrect the whole state is NIBRS (around 300 agencies). Someone may want to take a closer look at the table. The only thing I looked at was the NIBRS participation, so I don’t know if the other numbers are correct."

LarryBafundo commented 6 years ago

Lets spend some time looking into this, as it seems that we tend to have more reporting entities for the CDE than the UCR program expects. Assuming the data is correct, might this be a result of UCR referring to the number of reporting agencies in a state, versus the number of ORIs within a state? But even if that were true, how would it explain scenarios where we have agencies reporting data for states that don't even participate in NIBRS yet?

harrisj commented 6 years ago

Yep, I see it in the participation_rates table

state_name nibrs_participating_agencies
Alabama 52
Alaska 8
Arizona 11
Arkansas 19

Will have to see why...

harrisj commented 6 years ago

So, I'm not really finding any issues in the queries that are used to build things like the agency_participation table. Can you possibly try rerunning what is in the dba/after_load/participation_table.sql file and see if it persists? I'd do it myself, but I don't want to mess with the DB if you are doing some loading stuff.

harrisj commented 6 years ago

I guess I should clarify that the problem seems to be that nibrs_participated is being set to 1 for agencies which are covered by another agency and that agency is not actually in the nibrs_agency_reporting table. That said, I don't really see how this is happening because when I manually rerun the select that builds the agency_participation I don't see this behavior. Let me look for a little while longer.

harrisj commented 6 years ago

Yes, there seems to have been some sort of error in how agency_participation was generated, but I can't seem to replicate it.

Agency participation for agency_id=128 in Alabama

year nibrs_months_reported nibrs_participated covered
2,016 0 1 1
2,015 0 1 1
2,014 0 0 0
2,013 0 0 0
2,012 0 0 0

NIBRS participation should not be 1. But when I run the select query to make an agency_participation_temp table, I get this

year nibrs_months_reported nibrs_participated covered
2,016 0 0 1
2,015 0 0 1
2,014 0 0 0
2,013 0 0 0
2,012 0 0 0

So, I think you just need to rebuild the participation tables...

cacraig commented 6 years ago

I rebuilt this table on the production database. This can be closed @LarryBafundo

harrisj commented 6 years ago

So, I just ran the numbers for 2015

year state_name nibrs_participating_agencies
2,015 Alabama 0
2,015 Alaska 0
2,015 Arizona 0
2,015 Arkansas 0
2,015 California 0

Which looks a bit weird. But it seems fine in 2016. This might also be an issue where we count "participating" as 12 months of reporting (I seem to recall that Hoover, AL only did 10 months in 2015)

harrisj commented 6 years ago

Yes, this looks to be correct. For some reason, 2015 was a brutal year for NIBRS participation with only 4 agencies reporting 12 months of data (will file another issue for that). But here are the agencies in Arizona and Alabama for instance

state_name agency_ori agency_name nibrs_months_reported
Alabama AL0011200 Hoover 8
Arizona AZ0140000 Yuma 2
Arizona AZ0140500 Yuma 6
Arizona AZ0111300 Apache Junction 6
Arizona AZ0071100 Gilbert 5