hrecht / censusapi

R package to retrieve U.S. Census data and metadata via API
https://www.hrecht.com/censusapi/

Add optional NAICS2012 parameter #22

Closed hrecht closed 6 years ago

hrecht commented 7 years ago

https://www.census.gov/data/developers/data-sets/economic-census.html
https://www.census.gov/data/developers/data-sets/nonemployer-statstics-and-county-business-patterns.html

mattwilliamson13 commented 6 years ago

Hi, is this issue about being able to specify a particular NAICS code (or codes) to avoid downloading excess CBP data? Or is there already a way to do that which I'm just not figuring out? Let me know and I can open a new issue if this is the wrong place for it. Thanks!

hrecht commented 6 years ago

Hi @mattwilliamson13, are you having a NAICS issue? What's the code that you're trying to run/data you're trying to get? The API endpoints linked in this issue aren't 100% supported yet because their structure is different from others - but simple calls should work. I can bump that up the priority list if there's interest in it.

mattwilliamson13 commented 6 years ago

Hi @hrecht, thanks for getting back to me. I've been running the following:

cbp.1986.AL <- getCensus(name="cbp", vintage="1986", key = mykey,
                         vars = as.character(cbp.vars$name),
                         region = "county:27", regionin = "state:1")

This works fine; however, I need to do this for all counties, which results in an error:

cbp.1986.AL <- getCensus(name="cbp", vintage="1986", key = mykey,
                         vars = as.character(cbp.vars$name),
                         region = "county:*", regionin = "state:1")
Error: error: estimated query results exceed cell limit of 981228

I would like to be able to download the data for a few subsectors (which should reduce the number of cells necessary) based on the appropriate NAICS code (for a given endpoint), but am not sure if I can do this or how to do it. This may not actually be related to the issue you have open here which is why I wanted to check with you.

hrecht commented 6 years ago

Re this original issue: NAICS parameters that are required for the EWKS API and optional for the CBP APIs have been added in https://github.com/hrecht/censusapi/commit/0fc5dcf364d2a85d1bb4ada63f11eb28618c448b for the next release.

NAICS codes can now be specified in the business patterns and ewks APIs. The argument for NAICS codes depends on the vintage of data being used - NAICS2012, NAICS2007, NAICS2002, NAICS1997, or SIC. The best way to determine which one to use is the API endpoint's documentation, e.g. https://www.census.gov/data/developers/data-sets/cbp-nonemp-zbp/cbp-api.2005.html
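To illustrate choosing the argument by vintage, here is a minimal sketch for a more recent CBP vintage. It assumes the 2016 endpoint is classified with NAICS2012 and that the argument is spelled `naics2012` in lower case. The variable names and the sector code 23 are illustrative and should be checked against the endpoint's documentation. The call is guarded so that it only runs when censusapi is installed and a `CENSUS_KEY` environment variable is set.

```r
# Arguments for a 2016 CBP query restricted to one NAICS sector.
# Assumption: the 2016 endpoint uses the NAICS2012 classification.
cbp_2016_args <- list(
  name = "cbp",
  vintage = 2016,
  vars = c("EMP", "ESTAB", "NAICS2012_TTL", "GEO_TTL"),
  region = "state:*",
  naics2012 = "23"  # illustrative sector code
)

# Only call the API when the package and a key are available.
if (requireNamespace("censusapi", quietly = TRUE) &&
    nzchar(Sys.getenv("CENSUS_KEY"))) {
  cbp_2016 <- do.call(censusapi::getCensus, cbp_2016_args)
  print(head(cbp_2016))
}
```

For an older vintage, the `naics2012` entry would be swapped for whichever classification that endpoint documents.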

New functionality: This API call https://api.census.gov/data/1986/cbp?get=ESTAB,SIC_TTL,GEO_TTL&for=county:*&in=state:01&SIC=70 can now be run using the latest development version of censusapi.

cbp_1986 <- getCensus(name = "cbp",
  vintage = "1986",
  vars = c("ESTAB", "SIC_TTL", "GEO_TTL"),
  region = "county:*",
  regionin = "state:01",
  sic = "70")
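
Since the request was for a few subsectors, one possible pattern is to run one query per code and stack the results, which keeps each request under the cell limit. This is only a sketch: `get_cbp_by_sic` and `fetch_cbp_1986` are hypothetical helpers, not part of censusapi, and the second SIC code is a placeholder to be replaced with the codes actually needed.

```r
# Hypothetical helper: run `fetch` once per SIC code and row-bind the results.
# `fetch` must return a data frame with the same columns for every code.
get_cbp_by_sic <- function(sic_codes, fetch) {
  results <- lapply(sic_codes, fetch)
  do.call(rbind, results)
}

# Hypothetical wrapper around the 1986 CBP call for a single SIC code.
fetch_cbp_1986 <- function(code) {
  censusapi::getCensus(name = "cbp",
    vintage = "1986",
    vars = c("ESTAB", "SIC_TTL", "GEO_TTL"),
    region = "county:*",
    regionin = "state:01",
    sic = code)
}

# Only call the API when the package and a key are available.
if (requireNamespace("censusapi", quietly = TRUE) &&
    nzchar(Sys.getenv("CENSUS_KEY"))) {
  cbp_1986_subset <- get_cbp_by_sic(c("70", "58"), fetch_cbp_1986)
  print(head(cbp_1986_subset))
}
```

Passing the fetch function as an argument keeps the stacking logic separate from the API call, so it can be tried out with a stub before any requests are made.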

@mattwilliamson13 I hope this addresses your issue! Please let me know if this works.

mattwilliamson13 commented 6 years ago

Thanks @hrecht! Sorry for the delay in getting back to you. This seems to do what I need. Really appreciate your help.