DOI-ONRR / doi-extractives-data

Information on the extractive industries in the U.S. from federal data.
https://revenuedata.doi.gov/

Explore > Federal revenue by location: state/county data is not displaying #1049

Closed meiqimichelle closed 8 years ago

meiqimichelle commented 8 years ago

When you zoom in to the state/county level, county data isn't showing up in the bar charts or the map for any commodity type except oil/gas.

See this example (screenshot 2015-12-10 09 34 15).

shawnbot commented 8 years ago

So, in the latest revenue data from ONRR, there is no county-level data for Coal. I ran some command-line tests just to be sure. First, I looked at all of the Commodity values for rows that have FIPS codes:

% csvfilter -d excel-tab --filter 'FIPS' data/_input/onrr/county-revenues.tsv | csvstat -c Commodity --tab    
  7. Commodity
        <type 'unicode'>
        Nulls: False
        Values: Oil (bbl), Oil Shale, Oil & Gas, Gas (mcf), NGL (gal)

Row count: 18824

In other words: 18,824 of the 20,561 rows have FIPS codes, but all of them are in the Oil & Gas bucket. The inverse holds true, too:

% csvfilter -d excel-tab --filter 'not FIPS' data/_input/onrr/county-revenues.tsv | csvstat -c Commodity --tab
  7. Commodity
        <type 'unicode'>
        Nulls: False
        Unique values: 20
        5 most frequent values:
                Other Products: 608
                Coal:   248
                Geothermal:     194
                Hardrock:       168
                Coal (ton):     152
        Max length: 14

Row count: 1738

I ran csvstat -c Commodity on the full data set just to confirm, and there are 25 unique commodity values: 5 for Oil & Gas, and 20 others.
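The same split can be sketched in Python for anyone who wants to repeat the check without the csv command-line tools. This is a minimal sketch, not the pipeline's actual code: the function name `commodity_counts_by_fips` is made up for illustration, the sample rows are hypothetical rather than real ONRR data, and only the `FIPS` and `Commodity` column names come from the commands above.

```python
import csv
import io
from collections import Counter


def commodity_counts_by_fips(tsv_file):
    """Return (with_fips, without_fips) Counters of Commodity values.

    Hypothetical helper: assumes a tab-delimited file with FIPS and
    Commodity columns, as in the commands above.
    """
    with_fips, without_fips = Counter(), Counter()
    for row in csv.DictReader(tsv_file, dialect="excel-tab"):
        # Treat a missing or blank FIPS cell as "no county-level data".
        has_fips = bool((row.get("FIPS") or "").strip())
        target = with_fips if has_fips else without_fips
        target[row["Commodity"]] += 1
    return with_fips, without_fips


# Hypothetical sample rows for illustration only, not real ONRR data.
SAMPLE = (
    "State\tFIPS\tCommodity\tRevenue\n"
    "WY\t56005\tOil (bbl)\t100\n"
    "WY\t56005\tGas (mcf)\t50\n"
    "WY\t\tCoal (ton)\t75\n"
    "NV\t\tGeothermal\t20\n"
)

with_fips, without_fips = commodity_counts_by_fips(io.StringIO(SAMPLE))
print("with FIPS:", sorted(with_fips))
print("without FIPS:", sorted(without_fips))
# Commodities that never appear with a FIPS code have no county-level rows.
print("no county data:", sorted(set(without_fips) - set(with_fips)))
```

To run it against the real file, pass an open file handle for the TSV instead of the in-memory sample.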

@mentastc does this sound right? Is our data wrong, or did I just not get the latest?

meiqimichelle commented 8 years ago

@shawnbot The latest download we have is here or on our downloads page (Fed revenue by location, onshore, CY). Does that match the data that you're parsing? I see FIPS codes in that Excel file for more than just oil and gas.

shawnbot commented 8 years ago

Moving this back into in-progress so I can review the data.

shawnbot commented 8 years ago

This is indeed a data problem. Our county revenue data was out of date, but FWIW, the original ONRR data itself lacked the FIPS codes; it was not a pipeline problem.