NYCPlanning / data-engineering

Primary repository for NYC DCP's Data Engineering team

DE<>GIS: COLP Packaging #351

Open alexrichey opened 1 year ago

alexrichey commented 1 year ago

Automate the transformations that GIS applies to COLP, as outlined here

Macro Steps:

Individual Transformations to call out:

damonmcc commented 11 months ago

wondering if MyST markdown is a potential way to generate PDFs from markdown

damonmcc commented 10 months ago

outline of packaging stages from GIS doc

  1. Download and extract the full output.zip file to \COLP\<YYYY>\<YYYYMMDD>\raw_data
    • the shp and gdb data are contained within their own sub-archives. Leave these as-is until ready to process those files
  2. Process each dataset in the relevant subfolder
    • GDB: extract from \raw_data directly to \gdb and rename using the following conventions
      • GDB name colp_YYYYMMDD.gdb (e.g. colp_20230802.gdb)
      • Feature class name colp_YYYYMMDD (e.g. colp_20230802)
    • SHP: extract from \raw_data directly to \shp and rename using the following conventions
      • colp_YYYYMMDD.shp (e.g. colp_20230802.shp)
    • CSV: copy from \raw_data directly to \csv and rename using the following conventions
      • colp_YYYYMMDD.csv (e.g. colp_20230802.csv)
    • XLSX: convert csv to xlsx
      • ...
  3. Update the readme file from SharePoint and send to Matt/Amanda for review
  4. Receive confirmation from Matt/Amanda that readme is ready for posting
  5. Update gdb and shp metadata
  6. Zip the csv, xlsx, gdb, and shp files individually with both readme and metadata pdf
  7. Save in M:\GIS\BytesProduction\COLP\<YYYY>\<YYYYMMDD>\web
  8. Manually export the gdb to SDE Prod
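
A rough sketch of how the renaming and zipping stages (steps 2, 6, and 7) might be automated, using only the Python standard library. This is not the GIS team's actual process: the function names, the folder arguments, and the `colp_YYYYMMDD_<format>.zip` archive name are all hypothetical placeholders, since the doc does not say what the zip files are called. The xlsx conversion and the metadata updates are left out.

```python
from __future__ import annotations

import shutil
import zipfile
from datetime import date
from pathlib import Path


def packaged_names(release: date) -> dict[str, str]:
    """Map each format to its renamed output, e.g. colp_20230802.gdb."""
    stem = f"colp_{release:%Y%m%d}"
    return {fmt: f"{stem}.{fmt}" for fmt in ("gdb", "shp", "csv", "xlsx")}


def copy_csv(raw_csv: Path, version_dir: Path, release: date) -> Path:
    """Step 2 (CSV): copy from raw_data into csv/ under the versioned name."""
    target = version_dir / "csv" / packaged_names(release)["csv"]
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(raw_csv, target)
    return target


def zip_with_docs(data_path: Path, docs: list[Path], web_dir: Path) -> Path:
    """Steps 6-7: zip one dataset with the readme and metadata PDF into web/."""
    web_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical archive name; the GIS doc does not specify one.
    archive = web_dir / f"{data_path.stem}_{data_path.suffix.lstrip('.')}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        if data_path.is_dir():
            # A file geodatabase is a folder, so add its contents recursively.
            for f in sorted(data_path.rglob("*")):
                if f.is_file():
                    zf.write(f, f.relative_to(data_path.parent).as_posix())
        else:
            # Include sidecar files that share the stem (.shx, .dbf, .prj, ...).
            for f in sorted(data_path.parent.glob(f"{data_path.stem}.*")):
                zf.write(f, f.name)
        for doc in docs:
            zf.write(doc, doc.name)
    return archive
```

Something like `zip_with_docs(csv_path, [readme, metadata_pdf], version_dir / "web")` would then be called once per format, with the readme only passed in after the review in steps 3 and 4.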

damonmcc commented 10 months ago

thoughts on product release stages