Closed mbh329 closed 1 year ago
A bit of cleanup: I would just delete the old sca sql files (5a, 5c, etc.); they're checked in to main from the last PR if we really need to find them. And maybe update (or mostly delete) the comments at the top of our sca aggregate files.
Seems to all be building cleanly for me locally 👍
Copying from #403
After the fixes in that PR, 9 records still have no boro assigned:
source | record_id | record_name
---|---|---
HPD Projected Closings | 3d3a2dd3c6ec864f5e18cec8e03eb2a7 | 784 COURTLANDT AVENUE
HPD Projected Closings | ff4c3b3e09d4c3a0d527a5c7988242d0 | 1559 PROSPECT PLACE
HPD RFPs | 0ce4a0321f576ad9b397dec9358f255f | MWBE Site B - 1921 Atlantic Ave
HPD RFPs | 100c52f92560bc161659f99b08340e9f | 97 West 169 St
HPD RFPs | 1db4d18c803723face138966e70095d8 | NYCHA West Brighton RFP
HPD RFPs | 473c582483789bb07b5837ecdb746ae4 | Bedford-Stuyvesant Community Wealth and Wellness RFP (Fulton-Saratoga)
HPD RFPs | 862c3d3640912f5ec6561375a03449ef | Hunters Point F + G
HPD RFPs | 9bd199e88bb8dd9b128d0d3c71b520b5 | Hunters Point F + G
HPD RFPs | defae821da6ac0b1a7610be5c7d5f484 | SustaiNYC (E. 111th Street)
We'll likely need to add these to the corrections
Merged my changes into this branch @mbh329. I'm going to rebuild from scratch locally now on this branch
Linking @AmandaDoyle's comment on my PR here - waiting on confirmation but then we'll need to apply corrections
rebuilding locally now @fvankrieken
ran into a storage issue, cleaning stuff up and will try to rerun
if this is a WIP, should we convert it to a draft PR and re-request review when it's ready?
LGTM especially considering the urgency!
is it worth running locally before merging to ensure the csvs in `sca_output/` are definitely the result of the latest code changes? looks like they were generated 2 days ago
Yes, we should commit latest generated sca outputs. Or @mbh329 are those supposed to go in the data repo instead?
I've been generating them separately from the data repo, I think we can commit the most recent sca agg tables and then merge. Would you like to do that or should I? I have the latest ones from this morning but can rerun to make sure
I just dropped my local kpdb db to make room for edde so please do!
rebuilding now with latest data/corrections
Taking longer than I would like, ran into some more obtuse memory issues but building now
I didn't have any issue earlier - do you want me to build locally?
got it going, on the sca_aggregate step now
Still running the aggregate step but the corrections are not being applied to the 9 records without a borough. I double checked that I have the most recent data and I can confirm that the corrections_main table is up to date
confirming that the `units_gross` corrections (the ones before our borough corrections) are being successfully applied to the `kpdb` table
@AmandaDoyle @fvankrieken @damonmcc
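One way to narrow down why some corrections apply and others don't is to replay them outside the build and list the ones that fail to match. The snippet below is only a minimal sketch: the column names (`record_id`, `field`, `old_value`, `new_value`), the record IDs, and the matching rule (a correction applies only when the current value equals `old_value`) are all assumptions for illustration, not the actual `corrections_main` schema or the logic the build uses.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def apply_corrections(records: List[Row], corrections: List[Row]) -> List[Tuple[Row, str]]:
    """Apply corrections in place and return the ones that did not apply, with a reason."""
    by_id = {r["record_id"]: r for r in records}
    unapplied: List[Tuple[Row, str]] = []
    for c in corrections:
        rec = by_id.get(c["record_id"])
        if rec is None:
            unapplied.append((c, "record_id not found"))
            continue
        current = rec.get(c["field"])
        # Assumed rule: only overwrite when the stored value matches old_value.
        if current != c["old_value"]:
            unapplied.append((c, f"old_value mismatch (table has {current!r})"))
            continue
        rec[c["field"]] = c["new_value"]
    return unapplied


# Hypothetical rows, not real KPDB records.
records = [
    {"record_id": "rec-a", "borough": None, "units_gross": 10},
    {"record_id": "rec-b", "borough": None, "units_gross": 5},
]
corrections = [
    {"record_id": "rec-a", "field": "units_gross", "old_value": 10, "new_value": 12},
    # old_value is an empty string, but the table holds NULL/None, so this is skipped.
    {"record_id": "rec-b", "field": "borough", "old_value": "", "new_value": "Brooklyn"},
    {"record_id": "rec-z", "field": "borough", "old_value": None, "new_value": "Queens"},
]

unapplied = apply_corrections(records, corrections)
for correction, reason in unapplied:
    print(correction["record_id"], correction["field"], "->", reason)
```

If the real corrections step compares old values in a similar way, a NULL versus empty-string mismatch on the borough field would produce exactly this pattern (`units_gross` corrections apply, borough ones don't), but that is a guess to check, not a diagnosis.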
ran this and everything looked good on my local build
motivations
- `04_sca_aggregate.sh` to handle the execution and creation of the scripts and subsequent output files (`longform_csd_output`, `longform_es_zone_output`, `longform_subdist_output_cp_assumptions`). As of now this is NOT incorporated into the build workflow/GitHub Action and can only be done locally.

changes
- `kpdb` -> `_kpdb`, `Doe_school_zones_es_2019` -> `doe_eszones`
- `zap_project_many_bbls` that was needed and appeared in each script, runs each of the 3 scripts (`boundaries_es_zone.sql`, `boundaries_school_districts.sql`, and `boundaries_school_subdistricts.sql`) and then exports the CSVs. I decided to keep this separate from the existing `build` and `export` process because this is typically the very last thing that DE or Housing does. I don't think it makes sense to run the scripts and write new outputs each time we rebuild KPDB locally.
- `dev` branch on the main KPDB repo with the latest `data` submodule to avoid confusion. This has been signed off by housing as the "final" version of KPDB with all the latest corrections.
- `dcp_boroboundary` file without water included and used `st_intersect` instead of `st_within`. The motivation behind this was: "The issue was that when I was using `st_intersect` with the water included boundaries, Manhattan's water boundary is right up to the land boundary of Brooklyn so all the records on the BK waterfront with overlap into Manhattan were getting assigned to Manhattan".

notes
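The boundary issue described above can be illustrated with a toy model. This is a minimal sketch, assuming axis-aligned boxes in place of real borough polygons and made-up coordinates; `intersects` and `within` here are simplified stand-ins for the PostGIS predicates `ST_Intersects` and `ST_Within` (written `st_intersect` and `st_within` in the description), not the project's SQL.

```python
from typing import Callable, Dict, List, NamedTuple


class Box(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Box, b: Box) -> bool:
    """True if the boxes share any point (toy analogue of ST_Intersects)."""
    return a.xmin <= b.xmax and b.xmin <= a.xmax and a.ymin <= b.ymax and b.ymin <= a.ymax


def within(a: Box, b: Box) -> bool:
    """True if a lies entirely inside b (toy analogue of ST_Within)."""
    return b.xmin <= a.xmin and a.xmax <= b.xmax and b.ymin <= a.ymin and a.ymax <= b.ymax


def matches(lot: Box, boroughs: Dict[str, Box], predicate: Callable[[Box, Box], bool]) -> List[str]:
    """Names of the boroughs whose boundary satisfies the predicate for this lot."""
    return [name for name, boundary in boroughs.items() if predicate(lot, boundary)]


# Made-up geometry: Manhattan land is x in [0, 4], the river is [4, 6],
# Brooklyn land is [6, 10]. With water included, Manhattan runs right up
# to Brooklyn's land edge at x = 6.
with_water = {"Manhattan": Box(0, 0, 6, 10), "Brooklyn": Box(6, 0, 10, 10)}
land_only = {"Manhattan": Box(0, 0, 4, 10), "Brooklyn": Box(6, 0, 10, 10)}

# A hypothetical waterfront lot in Brooklyn that overlaps slightly into the river.
lot = Box(5.9, 4, 6.5, 5)

water_hits = matches(lot, with_water, intersects)
land_hits = matches(lot, land_only, intersects)
land_within = matches(lot, land_only, within)

print("intersects, water included:", water_hits)
print("intersects, land only:     ", land_hits)
print("within, land only:         ", land_within)
```

In this toy setup the lot intersects both boroughs when water is included, so which borough it gets depends on how the join resolves ties. With land-only boundaries it intersects only Brooklyn, while a strict `within` test matches nothing because the lot pokes past the shoreline, which is presumably why the intersection test was kept after switching boundary files.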
todo