NYCPlanning / db-developments

🏠 🏘️ 🏗️ Developments Database
https://nycplanning.github.io/db-developments
8 stars 2 forks source link

Export A2 jobs that involve a change in units #581

Closed SashaWeinstein closed 1 year ago

SashaWeinstein commented 1 year ago

549 Medium PR, two reviewers probably best 🍖

This PR builds off work on branch 549-A2-Jobs. That branch has the template that includes A2 jobs in the pull of BIS data from socrata to data library We will wait for housing to review the export of UNITS_A2_devDB before we move forward. This PR will be a draft until then. We aren't sure why so many A2 jobs have proposed class A units, hoping the housing team can help us figure that out

td928 commented 1 year ago

it seems like A2 jobs addition is making the build much harder probably expected but definitely something to think about. Also it makes me realize that the _status.sql needs a look for the permitting side of things and consequently the permitting template to pull permits data also needs to be adjusted.

SashaWeinstein commented 1 year ago

When you say harder I assume you mean longer runtimes? It definitely does make the build take longer. I think one solution is to filter out more A2 jobs higher up. Figured we would wait until getting more feedback from housing to implement that.

td928 commented 1 year ago

When you say harder I assume you mean longer runtimes? It definitely does make the build take longer. I think one solution is to filter out more A2 jobs higher up. Figured we would wait until getting more feedback from housing to implement that.

yea that's what I meant. Also question for housing to answer is whether the efforts worth the incremental benefits of including those units and massive amount of manual research this might require.

td928 commented 1 year ago

When you say harder I assume you mean longer runtimes? It definitely does make the build take longer. I think one solution is to filter out more A2 jobs higher up. Figured we would wait until getting more feedback from housing to implement that.

yea that's what I meant. Also question for housing to answer is whether the efforts worth the incremental benefits of including those units and massive amount of manual research this might require.

SashaWeinstein commented 1 year ago

both great points Te

SashaWeinstein commented 1 year ago

will you let housing know that we have output ready for their review, or should I?

td928 commented 1 year ago

will you let housing know that we have output ready for their review, or should I?

I am waiting for the new run to finish and then letting them know. Also this comment still stands regardless what housing says

it makes me realize that the _status.sql needs a look for the permitting side of things and consequently the permitting template to pull permits data also needs to be adjusted.

SashaWeinstein commented 1 year ago

Sorry I don't understand the thing about the permits

SashaWeinstein commented 1 year ago

There is an important piece of outstanding work here, which is to include A2 permits. The permit issuance template will have to be updated, data sync re-run from the feature, and DOB_DATA_DATE in version.env updated to point to that run of data sync

td928 commented 1 year ago

There is an important piece of outstanding work here, which is to include A2 permits. The permit issuance template will have to be updated, data sync re-run from the feature, and DOB_DATA_DATE in version.env updated to point to that run of data sync

also highlighting this secondary task for pulling permits so we don't forget about it.

SashaWeinstein commented 1 year ago

My understanding of where this stand now is that we are waiting for housing to get back to us with their manual research on A2 jobs. Want to confirm that this is correct

td928 commented 1 year ago

Some update from housing about this. In the spreadsheet attached is the join between the A2 units we pulled and their corrections from before. If you set the last 'In EDM' filter to No, you will see the records they found but not included in the list we pulled. I took a quickly look it seems somehow the "Combine" is not getting pick up. UNITS_A2_devdb (1) (2).csv

@SashaWeinstein

SashaWeinstein commented 1 year ago

I looked at a couple of the records marked with In EDM = No, and saw think some are being are filtered out because they aren't marked as residential in this CASE statement that assigns the resid_flag

SELECT
    *,
    (CASE 
        WHEN (hotel_init IS NOT NULL AND hotel_init <> '0')
            OR (hotel_prop IS NOT NULL AND hotel_prop <> '0')
            OR (otherb_init IS NOT NULL AND otherb_init <> '0')
            OR (otherb_prop IS NOT NULL AND otherb_prop <> '0')
            OR (classa_init IS NOT NULL AND classa_init <> '0')
            OR (classa_prop IS NOT NULL AND classa_prop <> '0')
            THEN 'Residential' 
    END) as resid_flag
INTO _UNITS_devdb
FROM _UNITS_devdb_raw

We use this flag to filter when we create the UNITS_A2_devdb and we shouldn't.

However I see 1100 records that housing flagged as not in EDM, and only 224 get a null resid on my machine. So I think this is one fix but not the only one

SashaWeinstein commented 1 year ago

Ok after some more investigation it seems that 918 of the 1100 records that housing flags are being filtered out because of the resid_flag filter. I'm going to remove that filter and re-run on github actions. Te how many of those records do we want to make sure get included? Is cutting it down from 1100 -> 182 ok or should we keep working on getting those 182 in?

td928 commented 1 year ago

Ok after some more investigation it seems that 918 of the 1100 records that housing flags are being filtered out because of the resid_flag filter. I'm going to remove that filter and re-run on github actions. Te how many of those records do we want to make sure get included? Is cutting it down from 1100 -> 182 ok or should we keep working on getting those 182 in?

This is really great progress. I guess I do want to have a handle on what those 182 are before making a call about whether to continue or not. Should we do a quick checkin in the afternoon if you are available?

SashaWeinstein commented 1 year ago

Yep sounds good just finishing up my lunch now, calendar is open to meet and discuss more

SashaWeinstein commented 1 year ago

Two jobs in our _INIT_BIS_devdb table that housing team identified as records we should have included but didn't make it to the UNITS_A2_devdb table: 321048488 and 122412709

SashaWeinstein commented 1 year ago

@td928 there a couple things to clean up, when I address those I'll re-request your review

SashaWeinstein commented 1 year ago

Addressed some small things and asked a clarifying question on the A2 units table we export