ccao-data / data-architecture

Codebase for CCAO data infrastructure construction and management
https://ccao-data.github.io/data-architecture/

Add Chicago CBD boundary to PIN universe #368

Closed. dfsnow closed this issue 2 months ago

dfsnow commented 3 months ago

We need to add an indicator to PIN universe, and to the various downstream model views, for whether a PIN is in the Chicago Central Business District (CBD).

The CBD shape file can be found here: https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Central-Business-District/tksj-nvsw

Steps

  1. Create a short ingest script to grab the shape file and put it in the raw S3 data bucket
  2. Add a cleaning script that moves the raw data to the warehouse and creates a table in the spatial database
  3. Perform a spatial intersection of PIN locations, copying the pattern found in existing spatial CTAS
  4. Update the location views (including the filled view)
  5. Update the PIN universe view
  6. Update the model views that are downstream of the location views
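
To make step 3 concrete, here is a minimal Python sketch of what the intersection produces: one boolean flag per PIN location. It is only an illustration, not the pattern used in the existing spatial CTAS. The rectangle standing in for the CBD boundary, the PIN keys, and the function names `point_in_polygon` and `flag_pins` are all hypothetical.

```python
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(point: Point, ring: Sequence[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the polygon ring."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right toggles the state
    return inside


def flag_pins(pins: Dict[str, Point], ring: Sequence[Point]) -> Dict[str, bool]:
    """Return an in-CBD indicator for every PIN location."""
    return {pin: point_in_polygon(xy, ring) for pin, xy in pins.items()}


# Made-up rectangle standing in for the real CBD boundary (assumption).
cbd_ring = [(-87.66, 41.86), (-87.61, 41.86), (-87.61, 41.91), (-87.66, 41.91)]

# Hypothetical PIN locations: one inside the rectangle, one well outside it.
pins = {
    "hypothetical-pin-a": (-87.63, 41.88),
    "hypothetical-pin-b": (-87.80, 41.95),
}

flags = flag_pins(pins, cbd_ring)
```

In the warehouse this test would presumably be done by the database's own spatial functions rather than hand-written geometry. The sketch only shows the shape of the output that the location views would then carry downstream.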

Damonamajor commented 3 months ago

  1. AWS S3: ingest the raw data in spatial-economy.R. Mimic the existing structure and add the URL in a new chunk of code, which puts the file in the raw S3 bucket.
  2. Create a warehouse script in spatial-economy.R. This will have a cleaning function specific to the data, and it then re-uploads the result as a Parquet file to the warehouse.
  3. Once the data is in the warehouse, go to the CTAS under aws-athena, location_economy.sql, and add a section for the new entity.
  4. Add the new field to the views: location-vw-pin10-location.sql, the vwpin10 location fill view, and default-vw-pin-universe.
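
The ingest step (step 1) could look roughly like the following. This is a Python sketch for illustration only, since the actual scripts are in R. The export URL pattern is an assumption about the city data portal, the key layout returned by `raw_key` is hypothetical, and `fetch` and `upload` are placeholder callables standing in for an HTTP download and an S3 upload.

```python
from typing import Callable
from urllib.parse import urlencode

BASE_URL = "https://data.cityofchicago.org"
DATASET_ID = "tksj-nvsw"  # dataset ID taken from the URL in the issue


def build_export_url(dataset_id: str, fmt: str = "GeoJSON") -> str:
    """Build a geospatial export URL (the endpoint pattern is an assumption)."""
    query = urlencode({"method": "export", "format": fmt})
    return f"{BASE_URL}/api/geospatial/{dataset_id}?{query}"


def raw_key(name: str, ext: str = "geojson") -> str:
    """Hypothetical object key layout inside the raw bucket."""
    return f"spatial/economy/{name}/{name}.{ext}"


def ingest(
    fetch: Callable[[str], bytes],
    upload: Callable[[str, bytes], None],
    name: str = "central_business_district",
) -> str:
    """Download the boundary file and hand it to the uploader unchanged."""
    url = build_export_url(DATASET_ID)
    key = raw_key(name)
    upload(key, fetch(url))  # raw data is stored as-is; cleaning happens in step 2
    return key
```

Passing `fetch` and `upload` in as arguments keeps the sketch free of network and AWS dependencies, so the URL and key logic can be checked on their own before wiring in a real client.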