MattTriano / analytics_data_where_house

An analytics engineering sandbox focusing on real estate prices in Cook County, IL
https://docs.analytics-data-where-house.dev/
GNU Affero General Public License v3.0

Develop a prototype representation for a Census API dataset #149

Closed MattTriano closed 1 year ago

MattTriano commented 1 year ago

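For a rough sketch of a Census API dataset, I need the dataset's base URL, the variables to request, the target geography (plus any enclosing geographies), and an API key.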

Here are those components for a dataset with the concept "MEANS OF TRANSPORTATION TO WORK BY TRAVEL TIME TO WORK":

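# Identifier of the 2021 ACS 5-year detailed tables dataset.
identifier = "https://api.census.gov/data/id/ACSDT5Y2021"

# Filter each metadata DataFrame (loaded earlier) down to the rows for this dataset.
dataset_df = api_metadata_df.loc[api_metadata_df["identifier"] == identifier].copy()
dataset_vars_df = api_vars_metadata_df.loc[api_vars_metadata_df["identifier"] == identifier].copy()
dataset_geo_df = api_geo_metadata_df.loc[api_geo_metadata_df["identifier"] == identifier].copy()

# Base URL that queries against this dataset are sent to.
dataset_url = dataset_df["distribution_access_url"].values[0]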
variables = [
    "GEO_ID", "TRACT",
    "B08134_001E", "B08134_002E", "B08134_003E", "B08134_004E", "B08134_005E", "B08134_006E", "B08134_007E", "B08134_008E", "B08134_009E", "B08134_010E",
    "B08134_061E", "B08134_062E", "B08134_063E", "B08134_064E", "B08134_065E", "B08134_066E", "B08134_067E", "B08134_068E", "B08134_069E", "B08134_070E",
    "B08134_071E", "B08134_072E", "B08134_073E", "B08134_074E", "B08134_075E", "B08134_076E", "B08134_077E", "B08134_078E", "B08134_079E", "B08134_080E",
    "B08134_081E", "B08134_082E", "B08134_083E", "B08134_084E", "B08134_085E", "B08134_086E", "B08134_087E", "B08134_088E", "B08134_089E", "B08134_090E"
]
geog = "tract:*"
geog_in = {
    "state": "17",
    "county": "031",
}
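api_key = "PLACEHOLDER_API_KEY"
# Each enclosing geography becomes its own `in=` clause, e.g. in=state:17&in=county:031.
geog_predicate = "&in=".join([f"{k}:{v}" for k, v in geog_in.items()])
request_url = f"""{dataset_url}?get={",".join(variables)}&for={geog}&in={geog_predicate}&key={api_key}"""
print(request_url)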

http://api.census.gov/data/2021/acs/acs5?get=GEO_ID,TRACT,B08134_001E,B08134_002E,B08134_003E,B08134_004E,B08134_005E,B08134_006E,B08134_007E,B08134_008E,B08134_009E,B08134_010E,B08134_061E,B08134_062E,B08134_063E,B08134_064E,B08134_065E,B08134_066E,B08134_067E,B08134_068E,B08134_069E,B08134_070E,B08134_071E,B08134_072E,B08134_073E,B08134_074E,B08134_075E,B08134_076E,B08134_077E,B08134_078E,B08134_079E,B08134_080E,B08134_081E,B08134_082E,B08134_083E,B08134_084E,B08134_085E,B08134_086E,B08134_087E,B08134_088E,B08134_089E,B08134_090E&for=tract:*&in=state:17&in=county:031&key=PLACEHOLDER_API_KEY

I'll probably want to make a class that includes logic for formatting more geographies or adding predicates to variables; anything to shield the user from complexity.
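
A minimal sketch of what such a class might look like, reusing the URL-building logic from the snippet above. The class name `CensusAPIDatasetRequest`, its fields, and the `build_url` method are all hypothetical and not part of the project's code. It also assumes that variable predicates are appended as plain `name=value` query parameters and that no values need URL escaping.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CensusAPIDatasetRequest:
    """Hypothetical container for the pieces of one Census API query."""

    base_url: str                      # e.g. the dataset's distribution_access_url
    variables: List[str]               # variable names for the `get=` clause
    geography: str                     # target geography for `for=`, e.g. "tract:*"
    within: Dict[str, str] = field(default_factory=dict)  # enclosing geographies for `in=`
    variable_predicates: Dict[str, str] = field(default_factory=dict)  # assumed name=value filters
    api_key: Optional[str] = None

    def build_url(self) -> str:
        """Assemble the query URL in the same clause order as the manual f-string."""
        parts = [f"get={','.join(self.variables)}", f"for={self.geography}"]
        # One `in=` clause per enclosing geography, in insertion order.
        parts.extend(f"in={name}:{value}" for name, value in self.within.items())
        # Assumption: predicates are passed straight through as query parameters.
        parts.extend(f"{name}={value}" for name, value in self.variable_predicates.items())
        if self.api_key is not None:
            parts.append(f"key={self.api_key}")
        return f"{self.base_url}?{'&'.join(parts)}"


request = CensusAPIDatasetRequest(
    base_url="http://api.census.gov/data/2021/acs/acs5",
    variables=["GEO_ID", "TRACT", "B08134_001E"],
    geography="tract:*",
    within={"state": "17", "county": "031"},
    api_key="PLACEHOLDER_API_KEY",
)
print(request.build_url())
```

Keeping the geography and predicate formatting inside `build_url` means a caller only supplies plain lists and dicts, which is one way to shield the user from the query-string details.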

Reference: Census API User Guide