Open hrecht opened 4 months ago
To test the usage of these named predicates, I retrieved variable metadata for all timeseries and aggregate endpoints with listCensusMetadata(). Here's how often each is used:
> param_vars %>% count(name, sort = T)
name n
1 YEAR 299
2 NAICS2012 99
3 category_code 20
4 data_type_code 20
5 NAICS2007 19
6 NAICS2002 17
7 NAICS1997 16
8 SIC 16
9 NAICS 12
10 MONTHLY 7
11 DATE 4
12 PERIOD 3
13 year 3
14 CATEGORY_CODE 1
15 DATA_TYPE_CODE 1
16 PSCODE 1
Note that in almost all cases the predicates are actually uppercase, not lowercase. getCensus() coerces all but data_type_code and category_code to uppercase in the request construction. The timeseries/qwi/sa, timeseries/qwi/se, and timeseries/qwi/rh endpoints use lowercase year as a predicate. (Documentation: https://www.census.gov/data/developers/data-sets/qwi.html)
This coercion to uppercase results in the following code failing because the required year predicate truly is lowercase here:
qwi <- getCensus(
name = "timeseries/qwi/sa",
vars = "Emp",
region = "state:02",
year = 2021,
quarter = 1)
# Error in apiCheck(req) :
# The Census Bureau returned the following error message:
# error: unknown predicate variable: 'YEAR'
# Your API call was: https://api.census.gov/data/timeseries/qwi/sa?key=[KEY]&get=Emp&for=state%3A02&YEAR=2021&quarter=1
Deprecating these named parameters is now also a bug fix to avoid this error.
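
The failure above can be reproduced in miniature without calling the API. The sketch below is only an illustration: build_query() and its coerce_named argument are hypothetical names, not censusapi's actual code, and the uppercasing rule is an assumption based on the behavior described in this comment (named parameters other than category_code and data_type_code are uppercased, while anything else is passed through as typed).

```r
# Named convenience parameters listed in this issue.
named_params <- c("year", "date", "period", "monthly", "category_code",
                  "data_type_code", "naics", "pscode", "naics2012",
                  "naics2007", "naics2002", "naics1997", "sic")

# Hypothetical helper, NOT censusapi's real implementation.
# Builds the predicate part of a query string from a named list.
# With coerce_named = TRUE, names on the list above (except category_code
# and data_type_code) are uppercased; all other names are kept as typed.
build_query <- function(predicates, coerce_named = TRUE) {
  nms <- names(predicates)
  if (coerce_named) {
    upper <- nms %in% setdiff(named_params, c("category_code", "data_type_code"))
    nms[upper] <- toupper(nms[upper])
  }
  paste0(nms, "=", unlist(predicates), collapse = "&")
}

qwi_predicates <- list(year = 2021, quarter = 1)

# Coercion turns the required lowercase `year` into `YEAR`.
coerced <- build_query(qwi_predicates)
# Passing names through verbatim keeps the predicate the endpoint expects.
verbatim <- build_query(qwi_predicates, coerce_named = FALSE)

print(coerced)
print(verbatim)
```

The coerced string matches the tail of the failing API call shown above, while the verbatim string keeps the lowercase year that the QWI endpoints require.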
Several years ago, when there were far fewer Census Bureau API endpoints, I added some optional parameters to getCensus() that were convenience options for some of the economic data endpoints. The package has supported arbitrary parameters (predicates, in Census-speak) since v0.6.0, released on CRAN in 2019, so these named parameters are both unnecessary and unwieldy. Also, catering specifically to certain endpoints in this function is not in the scope of the package.
The named parameters are:
c("year", "date", "period", "monthly", "category_code", "data_type_code", "naics", "pscode", "naics2012", "naics2007", "naics2002", "naics1997", "sic")
year was added to the list in v0.7.2 when some of the endpoints were swinging back and forth between using time versus year. They are now more consistent and many of the timeseries APIs use lowercase time as a required predicate. YEAR is a variable name in hundreds of the non-timeseries APIs.
The vast majority of this list are actually UPPERCASE predicates in the APIs, not lowercase.
Users will still be able to use the full functionality of the APIs with arbitrary parameters without having these as named optional parameters.
For example, use uppercase YEAR here instead of lowercase year. But really, the preferred syntax now is time, which is the endpoint's true predicate for filtering the timeseries.