NYCPlanning / db-developments

🏠 🏘️ 🏗️ Developments Database
https://nycplanning.github.io/db-developments
8 stars 2 forks source link

date_lastupdt and job_inactive status #546

Closed td928 closed 1 year ago

td928 commented 2 years ago

date_lastupdt is a DOB created field and it is noted by Housing that some of those dates are not necessarily the "most recent" date among all the date fields provided by DOB.

One major downstream impact this has is the job_inactive logic depends on the date_lastupdt to generate the Inactive: Stalled status on our end. which is then used by housing for research purposes and also quite extensively in creating the aggregate table.

Two approaches to fix the issue could be 1) create our own version of date_lastupdt to simply create the max of all the date fields provided by DOB. 2) only change the logic to create the job_inactive to take into account that the DOB date_lastupdt might not be a true latest date.

AmandaDoyle commented 2 years ago

Don't change logic for date last update Change logic for job inactive to use max for x number of date fields Do some research on if date of last update is very wrong.

SashaWeinstein commented 2 years ago

I did some research as to how many records have an incorrect date_lastupdt and get the Inactive: Stalled value in job_inactive.

with bad_date_lastupdt as (SELECT * from _MID_devdb 
where GREATEST(date_lastupdt::date, date_filed::date, date_statusd::date, date_statusp::date, date_statusr::date, date_statusx::date, date_complete::date ) > date_lastupdt::date)
SELECT a.* from STATUS_devdb a inner join bad_date_lastupdt b on a.job_number = b.job_number where a.job_inactive = 'Inactive: Stalled'

This returned 79 records. I'm going to change the logic to set that Inactive: Stalled value to instead to use the greatest of the those seven column and see how that changes the results