Closed by jbousquin 1 year ago
Not the most elegant, but suggesting something like:
```python
import numpy as np
import pandas as pd
import requests

data = []
# The API caps each request at 50 variables, so split them into chunks.
n_chunks = int(np.ceil(len(variables) / 50))
for chunk in np.array_split(variables, n_chunks):
    joined_vars = ",".join(chunk)
    params.update({"get": joined_vars})
    req = requests.get(url=base, params=params)
    if req.status_code != 200:
        raise ValueError(f"Request failed. The Census Bureau error message is {req.text}")
    out = pd.read_json(req.text)
    out.columns = out.iloc[0]  # first row holds the column names
    out = out[1:]
    data += [out]  # add output from each chunk to the list
out = pd.concat(data, sort=False, axis=1)
```
@jbousquin makes sense to me! This is similar to what we do in tidycensus as well. I wrote get_census()
as a minimal interface to the API that I'd refine if people started using it. I'll take a look at integrating this, or feel free to submit a PR if you'd like.
So I don't think the above solution will work - see https://github.com/hrecht/censusapi/issues/82 and https://github.com/walkerke/tidycensus/pull/165. The problem is that the sort order of rows is not always consistent across data pulls.
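One way around the inconsistent row ordering is to merge each chunk on its geography columns rather than concatenating positionally, which is roughly what tidycensus does. A minimal sketch with made-up data (the `state`/`county` columns and variable codes here are illustrative, not actual API output):

```python
import pandas as pd

# Two chunks returned by separate API calls; note the rows arrive
# in different orders, so a positional concat would misalign them.
chunk1 = pd.DataFrame(
    {"state": ["01", "02"], "county": ["001", "002"], "B01001_001E": [100, 200]}
)
chunk2 = pd.DataFrame(
    {"state": ["02", "01"], "county": ["002", "001"], "B19013_001E": [55000, 48000]}
)

# Merging on the geography columns aligns rows by key instead of by position.
merged = chunk1.merge(chunk2, on=["state", "county"], how="outer")
```

Each successive chunk would be merged into the accumulated frame the same way, so the final result is keyed on geography regardless of the order the API returned rows in.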
I'll poke around at this and come up with a workable solution.
The API won't return more than 50 variables at once, and the error message is descriptive enough.
The suggested enhancement is to split the variables across multiple requests (similar to cenpy, but splitting over variables and concatenating on columns rather than rows).
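The chunking step itself can be sketched like this, using a hypothetical list of variable codes; `np.array_split` produces the fewest near-equal chunks, each under the 50-variable cap:

```python
import numpy as np

# Hypothetical list of more than 50 Census variable codes.
variables = [f"B01001_{i:03d}E" for i in range(1, 121)]  # 120 codes

# Fewest chunks such that no chunk exceeds 50 variables.
n_chunks = int(np.ceil(len(variables) / 50))
chunks = np.array_split(variables, n_chunks)
```

Note that `np.array_split` requires an integer number of sections, so the result of `np.ceil` (a float) must be cast with `int()`.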