meiqimichelle closed this issue 8 years ago
So, in the latest revenue data from ONRR, there is no county-level data for Coal. I ran some command line tests just to be sure. First, I looked at all of the commodity values for rows with FIPS codes:
% csvfilter -d excel-tab --filter 'FIPS' data/_input/onrr/county-revenues.tsv | csvstat -c Commodity --tab
7. Commodity
<type 'unicode'>
Nulls: False
Values: Oil (bbl), Oil Shale, Oil & Gas, Gas (mcf), NGL (gal)
Row count: 18824
In other words: 18,824 of the 20,561 rows have FIPS codes, but all of them are in the Oil & Gas bucket. The inverse holds true, too:
% csvfilter -d excel-tab --filter 'not FIPS' data/_input/onrr/county-revenues.tsv | csvstat -c Commodity --tab
7. Commodity
<type 'unicode'>
Nulls: False
Unique values: 20
5 most frequent values:
Other Products: 608
Coal: 248
Geothermal: 194
Hardrock: 168
Coal (ton): 152
Max length: 14
Row count: 1738
I ran csvstat -c Commodity on the full data set just to confirm, and there are 25 unique commodity values: 5 for Oil & Gas, and 20 others.
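For anyone who wants to repeat this check without the command line tools, here is a rough Python sketch of the same split. It assumes the file is tab-delimited with columns named FIPS and Commodity (as the commands above suggest); the function name commodities_by_fips and the sample rows are made up for illustration and are not real ONRR data.

```python
import csv
from collections import Counter


def commodities_by_fips(lines, fips_col="FIPS", commodity_col="Commodity"):
    """Count Commodity values separately for rows with and without a FIPS code.

    `lines` is any iterable of tab-delimited text lines, header first.
    The column names are assumptions based on the commands in this thread.
    """
    with_fips, without_fips = Counter(), Counter()
    for row in csv.DictReader(lines, dialect="excel-tab"):
        # Treat a missing or blank FIPS cell as "no FIPS code".
        has_fips = bool((row.get(fips_col) or "").strip())
        target = with_fips if has_fips else without_fips
        target[row.get(commodity_col) or ""] += 1
    return with_fips, without_fips


# Hypothetical sample rows, only to show the shape of the check.
sample = [
    "FIPS\tCommodity\tRevenue",
    "08001\tOil (bbl)\t100",
    "\tCoal\t200",
    "08003\tGas (mcf)\t50",
    "\tGeothermal\t10",
]

with_fips, without_fips = commodities_by_fips(sample)
print("with FIPS:", sorted(with_fips))
print("without FIPS:", sorted(without_fips))
```

To run it against the real data, pass an open file handle for data/_input/onrr/county-revenues.tsv (opened with newline="") instead of the sample list. If Coal only ever appears in the second counter, that matches what the csvstat output above shows.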
@mentastc does this sound right? Is our data wrong, or did I just not get the latest?
@shawnbot The latest download we have is here or on our downloads page (Fed revenue by location, onshore, CY). Does that match the data that you're parsing? I see FIPS codes in that Excel for more than just oil and gas.
Moving this back into in-progress so I can review the data.
This is indeed a data problem. Our county revenue data was out of date, but FWIW, it was the original ONRR data that lacked the FIPS codes; it was not a pipeline problem.
When you zoom in to the state/county level, county data isn't showing up in the bar charts or the map for any commodity type except oil/gas.
See this example here