Closed sharifamlani closed 4 years ago
Potential Solution
# Return the API's built-in error message if the call is invalid
apiCheck <- function(req) {
  if (req$status_code == 400) {
    error_message <- (gsub("<[^>]*>", "", httr::content(req, as = "text")))
    if (error_message == "error: missing 'for' argument") {
      stop("This dataset requires you to specify a geography with the 'region' argument.")
    }
    stop(paste("The Census Bureau returned the following error message:\n", error_message,
               "\n Your API call was: ", print(req$url)))
  }
  # Some time series don't give error messages, just don't resolve (e.g. SAIPE)
  if (req$status_code == 204) stop("204, no content was returned.\nSee ?listCensusMetadata to learn more about valid API options.", call. = FALSE)
  if (identical(httr::content(req, as = "text"), "")) stop(paste("No output to parse. \n Your API call was: ", print(req$url)), call. = FALSE)
}
apiParse <- function(req) {
  if (jsonlite::validate(httr::content(req, as = "text"))[1] == FALSE) {
    error_message <- (gsub("<[^>]*>", "", httr::content(req, as = "text")))
    stop(paste("The Census Bureau returned the following error message:\n", error_message, "\nYour api call was: ", req$url))
  } else {
    raw <- jsonlite::fromJSON(httr::content(req, as = "text"))
  }
}
# Function to clean up column names - particularly ones with periods in them
cleanColnames <- function(dt) {
  # No trailing punct
  colnames(dt) <- gsub("\\.[[:punct:]]*$", "", colnames(dt))
  # All punctuation becomes underscore
  colnames(dt) <- gsub("[[:punct:]]", "_", colnames(dt))
  # Get rid of repeat underscores
  colnames(dt) <- gsub("(_)\\1+", "\\1", colnames(dt))
  return(dt)
}
responseFormat <- function(raw) {
  # Make first row the header
  colnames(raw) <- raw[1, ]
  df <- data.frame(raw)
  df <- df[-1, ]
  df <- cleanColnames(df)
  # Make all columns character
  df[] <- lapply(df, as.character)
  # Make columns numeric if they have numbers in the column name - note some APIs use string var names
  # For ACS data, do not make columns numeric if they are ACS annotation variables - ending in MA or EA or SS
  # Do not make label variables (ending in _TTL) numeric
  value_cols <- grep("[0-9]", names(df), value = TRUE)
  error_cols <- grep("MA|EA|SS|_TTL|_NAME|NAICS2012|NAICS2012_TTL|fage4|FAGE4", value_cols, value = TRUE, ignore.case = TRUE)
  for (col in setdiff(value_cols, error_cols)) df[, col] <- as.numeric(df[, col])
  row.names(df) <- NULL
  return(df)
}
################ Here is updated getCensus2 code that worked #################
# Note: I have tested it on the American Community Survey and County Business Patterns for congressional districts. Updates and improvements welcome.
getCensus2 <- function(name, vars, region, vintage, key = Sys.getenv("CENSUS_KEY")) {
  vars1 <- paste(c(vars), collapse = ",")
  API_URL <- paste("https://api.census.gov/data/", vintage, "/", name, "?get=", vars1, "&for=", region, "&key=", key, sep = "")
  x <- httr::GET(API_URL)
  # Check the API call for a valid response
  apiCheck(x)
  # If check didn't fail, parse the content
  raw <- apiParse(x)
  # Format the response into a nice data frame
  df <- responseFormat(raw)
  return(df)
}
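One caveat with pasting region straight into the URL, as getCensus2 does: a region written with a literal space (for example "congressional district:*") would go into the request unencoded. Below is a minimal sketch of one way to accept both forms. It assumes a hypothetical helper named buildCensusURL, which is not part of censusapi, and relies on base R's utils::URLencode, which by default returns a string unchanged if it already contains %xx escapes.

```r
# Hypothetical helper (not part of censusapi): build the request URL and
# percent-encode each query value at most once.
buildCensusURL <- function(name, vars, region, vintage, key = "") {
  base <- paste0("https://api.census.gov/data/", vintage, "/", name)
  query <- c(get = paste(vars, collapse = ","), "for" = region)
  # Only append the key when one is supplied
  if (nzchar(key)) query <- c(query, key = key)
  # URLencode() with repeated = FALSE (the default) leaves a string alone if it
  # already contains %xx escapes, so "%20" is not turned into "%2520"
  encoded <- vapply(query, utils::URLencode, character(1))
  paste0(base, "?", paste0(names(encoded), "=", encoded, collapse = "&"))
}

# Both spellings of the region produce the same request URL
url_space   <- buildCensusURL("acs/acs1/spp", "S0201PR_0093E", "congressional district:*", 2016)
url_encoded <- buildCensusURL("acs/acs1/spp", "S0201PR_0093E", "congressional%20district:*", 2016)
print(url_space)
print(url_encoded)
```

The resulting string could be passed to httr::GET in place of API_URL. Whether that fits the rest of the workflow is untested here.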
Hi there, change the %20 in your call to a space. The package takes care of URL encoding as needed. This works for me.
CD_Pure <- getCensus(name = "acs/acs1/spp",
                     vars = "S0201PR_0093E",
                     region = "congressional district:*",
                     vintage = 2016)
In the future you can see the geographies available using:
geos <- listCensusMetadata(name = "acs/acs1/spp", vintage = 2016, type = "geographies")
Use the name exactly as it's written in the name column of the response. Hope that helps!
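The %2520 reported in this issue is what percent-encoding looks like when it is applied twice: the % of an existing %20 is itself encoded as %25. Here is a small illustration using base R's utils::URLencode. It is used only to demonstrate the effect. That the package's own encoding step behaves analogously is an assumption, and this is not its actual code path.

```r
# A region written with a literal space is encoded once
once <- utils::URLencode("congressional district:*")

# Forcing a second pass over an already-encoded string (repeated = TRUE)
# encodes the "%" itself as "%25"
twice <- utils::URLencode("congressional%20district:*", repeated = TRUE)

print(once)   # "congressional%20district:*"
print(twice)  # "congressional%2520district:*"
```

This is why passing the region with a plain space, and letting the encoding happen once, yields a valid 'for' argument.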
Ahh great! Thank you so much for the response. Yes, the code you provided works perfectly for me. I appreciate all your help and the helpful tip as well!
Describe the bug
When using the following code to call data from the American Community Survey
The package's API call was:
Notice that although I am passing region = "congressional%20district:*" in the 'region' argument of the function, the 'for' parameter in the API request is congressional%2520district.
This bug produces the following error:
Error in apiCheck(req) : The Census Bureau returned the following error message: error: invalid 'for' argument
To Reproduce
To compare, when I call the API directly using the same parameters:
I am able to communicate with the API successfully
Date: 2020-06-12 22:15
Status: 200
Content-Type: application/json;charset=utf-8
Size: 10.6 kB
[["S0201PR_0093E","S0201_0123E","state","congressional district"],
[null,"3.9","01","01"],
[null,"7.4","01","02"],
[null,"7.1","01","03"],
[null,"4.2","01","04"],
[null,"6.4","01","05"],
[null,"4.9","01","06"],
[null,"5.1","01","07"],
[null,"7.2","02","00"],
[null,"8.1","04","01"],
Expected behavior
The issue in the function is that when 'region' is set to congressional districts (congressional%20district:*), the API call includes the 'for' argument congressional%2520district. Instead, the API call should include only congressional%20district.
R session information:
Additional context
Note: I use congressional%20district because that is what is given in the example API call on the Census website.
Thanks so much for making the package! I really appreciate it.