Add ZoningUG_init and ZoningUG_prop columns

mbh329 commented 1 year ago

This PR addresses issue #592. Two reviewers :v:

About

The Housing team requested two additional columns, ZoningUG_init (Existing Zoning Use Group) and ZoningUG_prop (Proposed Zoning Use Group), taken from a cut of the dob_now_applications data we receive directly from DOB. Specifically, the DCP Housing team asked us to clean the data, put the values in an array, and keep only distinct zoning use group values.

One issue that Sam and I went over in review is that the data is very messy and not necessarily what he was expecting. However, the data mirrors what's on the DOB NOW website, so the values are "expected" in that sense. There are a few things to point out about the data:

This may be obvious, but new buildings in DOB NOW won't have an existing zoning use group...because they didn't exist!
Buildings can have multiple zoning use groups (both initial and proposed).

Some issues with the data we got from DOB

1a could appear as 1A, 1-A, or A1 (which, from my understanding, are the same zoning use group)
The data wasn't consistently separated by commas (e.g. 1A & 1b, 2c)
For whatever reason, the column sometimes contains free text or whole sentences (e.g. 1a, 2c & why does this column contain random text, 10-a)

I tried to handle these cases in the code, but if there are any additional suggestions, I'd be interested in hearing them.
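For reviewers who want to reason about the cleaning rules outside the database, here is a minimal Python sketch of the cases listed above. It is not the SQL in this PR: the function name `clean_use_groups`, the two regular expressions, and the choice to treat A1 as 1A are assumptions made for illustration, and the real transformation lives in sql/now/_init.sql.

```python
import re

# Hypothetical patterns (assumptions, not taken from the PR's SQL):
# a number with an optional letter ("1A", "1-a", "10"), or the letter first ("A1").
NUMBER_FIRST = re.compile(r"^(\d{1,2})(?:-?([A-Za-z]))?$")
LETTER_FIRST = re.compile(r"^([A-Za-z])-?(\d{1,2})$")


def clean_use_groups(raw):
    """Return the distinct, normalized use groups found in a raw string."""
    if raw is None:
        return []  # e.g. a new building with no existing use group
    cleaned = []
    # Values are separated inconsistently, so split on both "," and "&".
    for part in re.split(r"[,&]", raw):
        token = part.strip()
        match = NUMBER_FIRST.match(token)
        if match:
            number, letter = match.group(1), match.group(2) or ""
        else:
            match = LETTER_FIRST.match(token)
            if not match:
                continue  # free text or anything unrecognized is dropped
            letter, number = match.group(1), match.group(2)
        code = f"{number}{letter.upper()}"
        if code not in cleaned:  # keep only distinct values, in first-seen order
            cleaned.append(code)
    return cleaned


print(clean_use_groups("1A & 1b, 2c"))
print(clean_use_groups("1a, 2c & why does this column contain random text, 10-a"))
```

The sketch silently drops anything it cannot parse, which matches the goal of keeping only recognizable use groups. Logging the dropped fragments would be a reasonable addition if the Housing team wants to audit them.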

Testing Code

To test the data, set the version of dob_now_applications to 20221001. This dataset was not ingested through data library as latest because we are not sure whether it will be the "stable" data moving forward, and we don't have updated dob_now_permits data from DOB (this matters because we want the dob_now datasets to stay in sync with each other).

The new logic was added to the sql/now/_init.sql script, where the dob_now_applications data is transformed. Dummy columns of the same names were also added to sql/bis/_init.sql so that the UNION in sql/_init.sql succeeds. After that, the columns were added to the necessary intermediary tables and to the final devdb output.
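To show why the BIS side needs placeholder columns, here is a small self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for the project database. The table names `bis_jobs` and `now_jobs`, the `job_number` column, and the sample rows are hypothetical. The PR adds the dummy columns in sql/bis/_init.sql itself, while the sketch uses NULL placeholders in the SELECT to illustrate the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical stand-ins for the BIS and DOB NOW intermediate tables.
    CREATE TABLE bis_jobs (job_number TEXT);
    CREATE TABLE now_jobs (job_number TEXT, zoningug_init TEXT, zoningug_prop TEXT);
    INSERT INTO bis_jobs VALUES ('B1');
    INSERT INTO now_jobs VALUES ('N1', '1A', '2C');
    """
)

# Without placeholders, the two sides have different column counts
# and the UNION is rejected.
try:
    conn.execute(
        "SELECT job_number FROM bis_jobs "
        "UNION SELECT job_number, zoningug_init, zoningug_prop FROM now_jobs"
    )
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)

# With NULL dummy columns of the same names, both sides line up.
rows = conn.execute(
    """
    SELECT job_number, NULL AS zoningug_init, NULL AS zoningug_prop FROM bis_jobs
    UNION
    SELECT job_number, zoningug_init, zoningug_prop FROM now_jobs
    ORDER BY job_number
    """
).fetchall()

print(mismatch_error)
print(rows)
```

BIS rows carry NULL in both new columns, so downstream tables can select ZoningUG_init and ZoningUG_prop without caring which source a row came from.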