nprapps / leso

Processing scripts for Defense Logistics Agency LESO data
http://blog.apps.npr.org/2014/09/02/reusable-data-processing.html
MIT License

Errors in the fips_crosswalk.csv? #21

Open dannguyen opened 9 years ago

dannguyen commented 9 years ago

When redoing the guns-per-capita calculation, I ran into issues with Starr County, TX, which is listed as getting 12.7 guns per 1,000 people in this graphic.

While looking through the source files, I noticed what appears to be an error in fips_crosswalk.csv:

| County | State | Zip |
| --- | --- | --- |
| STARR | TX | 48247 |

However, on the Census site, the FIPS for Starr County is listed as 48427.

The 48247 code applies to Jim Hogg County, which has a population of around 5,300, compared to Starr County's roughly 60,000.
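To make the stakes concrete, here is a rough sketch of how much the per-1,000 rate shifts when the wrong county's population ends up in the denominator. The item count is a hypothetical placeholder, not a figure from the LESO data, and the populations are only the approximate ones quoted above.

```python
def per_1000(count, population):
    """Return items per 1,000 residents."""
    return count / population * 1000

# Approximate populations quoted in this comment.
JIM_HOGG_POP = 5_300
STARR_POP = 60_000

# Hypothetical item count, for illustration only (not from the LESO data).
items = 100

rate_with_wrong_fips = per_1000(items, JIM_HOGG_POP)
rate_with_right_fips = per_1000(items, STARR_POP)

# The count cancels out, so this ratio depends only on the two populations.
inflation = rate_with_wrong_fips / rate_with_right_fips
print(round(inflation, 1))
```

Whatever the real count is, joining Starr County to Jim Hogg County's population would inflate the rate by about an order of magnitude.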

I didn't go through the trouble of cross-referencing the codes in the fips_crosswalk against the official FIPS list (one compilation is here, at their QuickFacts endpoint), so I don't know if that's the only mixup or not.

In the fips_crosswalk.csv file, STARR, TX is actually listed under both 48427 and 48247, so I don't know whether the 12.7 guns per 1,000 figure in the story is actually wrong (it would depend on which of the two census rows you used). However, in the LESO data - all states.csv that you have currently published, Starr County is assigned to both FIPS codes.
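A quick way to look for other cases like this is to group the crosswalk by county and state and flag any pair that maps to more than one code. This is only a sketch: the column names `County`, `State`, and `Zip` are assumed from the excerpt above, the inline sample stands in for the real file, and the ZAPATA row is an illustrative unambiguous entry that may not match what the file actually contains.

```python
import csv
import io
from collections import defaultdict

def find_ambiguous_counties(rows):
    """Map (county, state) -> sorted codes, keeping only keys with 2+ codes."""
    codes = defaultdict(set)
    for row in rows:
        key = (row["County"].strip().upper(), row["State"].strip().upper())
        codes[key].add(row["Zip"].strip())
    return {key: sorted(vals) for key, vals in codes.items() if len(vals) > 1}

# Tiny inline sample; column names are an assumption based on the excerpt.
# The two STARR rows mirror what the comment describes; ZAPATA is illustrative.
SAMPLE = """County,State,Zip
STARR,TX,48427
STARR,TX,48247
ZAPATA,TX,48505
"""

ambiguous = find_ambiguous_counties(csv.DictReader(io.StringIO(SAMPLE)))
print(ambiguous)
```

To run it against the real file, replace the `io.StringIO(SAMPLE)` argument with an opened file handle for fips_crosswalk.csv.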

eads commented 9 years ago

Thanks Dan, I'll look into this. But it sure looks like an error in our FIPS crosswalk, which we got from an NPR reporter. I'd love to find a better one if you know of any...

dannguyen commented 9 years ago

You can pull a cross-reference from the Census site:

http://quickfacts.census.gov/qfd/download_data.html (specifically, the FIPS_CountyName.txt)

Here's a regex-cleaned-up one for your convenience (though in all caps): https://github.com/datajanitor/diaries/blob/master/leso_and_census/data-cleaned/FIPS_CountyName.txt
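With a reference list in hand, the cross-check could look roughly like the sketch below. It assumes each reference line has the shape `48427 Starr County, TX` (a five-digit code, a space, then the name and state abbreviation); that layout is an assumption about FIPS_CountyName.txt, not something verified here. The helper names and the inline sample data are hypothetical.

```python
import re

# Assumed line shape: 5-digit code, whitespace, then "Name County, ST".
LINE_RE = re.compile(r"^(\d{5})\s+(.+?),\s*([A-Z]{2})$")
# Strip common suffixes so names line up with the crosswalk's bare names.
SUFFIX_RE = re.compile(r"\s+(COUNTY|PARISH|BOROUGH|CENSUS AREA)$")

def parse_reference(lines):
    """Build {(NAME, ST): fips} from reference lines; skip non-county rows."""
    ref = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # e.g. a state-level row with no ", ST" suffix
        fips, name, state = m.groups()
        ref[(SUFFIX_RE.sub("", name.upper()), state)] = fips
    return ref

def mismatches(crosswalk_rows, ref):
    """Return (county, state, code, expected) for rows that disagree."""
    bad = []
    for county, state, code in crosswalk_rows:
        expected = ref.get((county.upper(), state.upper()))
        if expected is not None and expected != code:
            bad.append((county, state, code, expected))
    return bad

# Hypothetical sample data using the codes discussed in this thread.
reference = parse_reference([
    "48000 TEXAS",
    "48247 Jim Hogg County, TX",
    "48427 Starr County, TX",
])
crosswalk = [("STARR", "TX", "48427"), ("STARR", "TX", "48247")]
bad_rows = mismatches(crosswalk, reference)
print(bad_rows)
```

Rows whose county name is missing from the reference are skipped rather than flagged, so spelling differences between the two files would need a separate pass.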