sfcpc / housing-dashboard


[Schemaless] Need to group PTS permit data by mapblocklot, filed_date, proposed_use #97

Closed rajpara11 closed 4 years ago

rajpara11 commented 4 years ago

See issue 95 for more info. In order to get all the permits associated with a project, we will have to group by mapblocklot, filed_date, and proposed_use.

We can do this either in our uuid_map logic (if we want it to affect our dashboard numbers) or in our separate matching script (if we just want PPTS to be updated with all associated building permits, maybe?).

rajpara11 commented 4 years ago

Note that this will involve joining on the Parcels data (https://data.sfgov.org/Geographic-Locations-and-Boundaries/Parcels-Active-and-Retired/acdm-wktn/data). Since we don't have any data in our schemaless right now that isn't associated with 'projects', we'd have to decide how we want to do this (maybe just a preprocessing step that adds mapblocklot to PTS data?)
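
A minimal sketch of what such a preprocessing step might look like, assuming the PTS records carry separate `block` and `lot` fields and the Parcels records expose `blklot` and `mapblklot` columns. Those field names, and the helper `add_mapblocklot`, are assumptions for illustration and are not taken from this repo:

```python
def add_mapblocklot(permits, parcels):
    """Return copies of the permit records with a 'mapblocklot' key added.

    Field names ('block', 'lot', 'blklot', 'mapblklot') are assumed here
    and may not match the real PTS and Parcels exports.
    """
    # Lookup table: block+lot string -> map block lot.
    lookup = {parcel["blklot"]: parcel["mapblklot"] for parcel in parcels}

    enriched = []
    for permit in permits:
        blklot = permit["block"] + permit["lot"]
        # Permits whose parcel is not found get None rather than being dropped.
        enriched.append({**permit, "mapblocklot": lookup.get(blklot)})
    return enriched


# Made-up sample rows: two lots that map to the same map block lot.
parcels = [
    {"blklot": "3512001", "mapblklot": "3512001"},
    {"blklot": "3512002", "mapblklot": "3512001"},
]
permits = [
    {"permit_number": "A1", "block": "3512", "lot": "002"},
    {"permit_number": "A2", "block": "9999", "lot": "001"},
]
enriched = add_mapblocklot(permits, parcels)
```

Keeping unmatched permits with a `None` value makes it easy to see how many PTS rows fail the join before deciding what to do with them.
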

rajpara11 commented 4 years ago

From Alton's e-mail, the updated way we should be doing this is:

1.  Associate mapblocklot data to PTS by joining against the Parcels data set (translates block lot to map block lot)
2.  Group all permits where filed_date, mapblocklot and proposed_use are the same
3.  Filter permits that are permit type 1, 2 or 3 with net_units > 0
4.  Only take the first when there are duplicate permit_numbers (first of series of building permit numbers)
5.  Remove permits with status ‘withdrawn’ or ‘cancelled’
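
Steps 2 through 5 above could be sketched roughly as follows, assuming step 1 has already attached `mapblocklot` to each permit record. The record layout, the constant names, and the function `group_permits` are hypothetical, and applying the filters in the listed order (type and units, then duplicates, then status) is one interpretation of the e-mail, not a confirmed rule:

```python
from collections import defaultdict

# Assumed values, based only on the steps listed above.
ALLOWED_PERMIT_TYPES = {1, 2, 3}
EXCLUDED_STATUSES = {"withdrawn", "cancelled"}


def group_permits(permits):
    """Group permit numbers by (mapblocklot, filed_date, proposed_use)."""
    seen_numbers = set()
    groups = defaultdict(list)
    for permit in permits:
        # Step 3: keep permit types 1, 2 or 3 with net_units > 0.
        if permit["permit_type"] not in ALLOWED_PERMIT_TYPES:
            continue
        if permit["net_units"] <= 0:
            continue
        # Step 4: only the first record for a given permit_number counts.
        if permit["permit_number"] in seen_numbers:
            continue
        seen_numbers.add(permit["permit_number"])
        # Step 5: drop withdrawn or cancelled permits.
        if permit["status"].lower() in EXCLUDED_STATUSES:
            continue
        # Step 2: group on the three shared fields.
        key = (permit["mapblocklot"], permit["filed_date"], permit["proposed_use"])
        groups[key].append(permit["permit_number"])
    return dict(groups)


def make_permit(number, permit_type, net_units, status):
    """Build a made-up sample record sharing one parcel, date and use."""
    return {
        "permit_number": number,
        "permit_type": permit_type,
        "net_units": net_units,
        "status": status,
        "mapblocklot": "3512001",
        "filed_date": "2019-01-02",
        "proposed_use": "apartments",
    }


sample = [
    make_permit("P1", 1, 10, "issued"),
    make_permit("P2", 2, 10, "filed"),
    make_permit("P1", 1, 10, "issued"),     # duplicate permit_number
    make_permit("P3", 8, 0, "issued"),      # wrong type, no net units
    make_permit("P4", 1, 5, "withdrawn"),   # excluded status
]
grouped = group_permits(sample)
```

With this made-up sample, only P1 and P2 survive the filters and end up in a single group.
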

rajpara11 commented 4 years ago

This has been implemented through other issues and PRs.