NYCPlanning / db-colp

City Owned and Leased Properties Database (COLP)
0 stars 0 forks source link

Correct by value or by id #199

Closed SashaWeinstein closed 2 years ago

SashaWeinstein commented 2 years ago

Fix modification process

Issue #198 outlines the issue this PR addresses. The fix requires adding more logic to an already convoluted _procedures.sql. I think this code can be made shorter and simpler with a bigger refactor, but that's outside the scope of getting this build out. Te and I worked on all this code together, pair coding through all the development and testing.

Split modifications table into by_id and by_value

As @abrieff pointed out, this step isn't necessary and could be achieved with a query within the logic of procedures.sql. However we chose this more verbose approach as it was easier to check our work along the way with postico. Instead of one call of apply_correction in the build step there are now two calls, one for each table.

New logic in _procedures.sql

Two if/else blocks were added to apply the correct logic based on the parameters passed to the procedure call. The first new if/else block is to set the records_to_recode boolean (which replaces the applicable boolean). If the recode is by uid, it checks if there are any records with the given uid. If the recode is by value, it checks if there are any records with the given value for the given field. If there are any records to be recoded, records_to_recode is set to true. The second new if/else block determines how records are recoded. If the recode is by uid, the new value is set for the given field for all records with a matching uid. If the recode is by value, then all records with a matching old_value for the given field are recoded. No changes are made to the logic in situations in which there are zero records to recode.

Testing

We tested the code by running the modifications table as is, and also changing the uid in the modifications by id table to the actual uid of an existing record to see if it successfully recoded the value. In both situations the code appeared to work.

Misc changes

The call of load_modifications.sql was moved from the dataloading step to the build step as it's logic we want to run each time we build. I think it should live in build as more logic has been added.

td928 commented 2 years ago

Tested on my local and results look good to me. One minor change I want to make is the pre_corr_val is depreciated now since neither the correction by id nor correction by value workflow consider "pre corrected value" anymore before making the change. So I am deleting it out of all the table to avoid any confusion (they would be all null in any case). See the commit below for this change.

td928 commented 2 years ago

@AmandaDoyle the new commit made above should have the updated csv and also logic to do corrections based on whether the uid is null or not.

SashaWeinstein commented 2 years ago

Ran on my machine with some extra print statements and got the expected behavior

td928 commented 2 years ago

a minor change above for where deleting is happening for the applied corrections see @abrieff comments above. Tested on my local and results are still good. I think it should be good to merge in and kick off production since the change is minor compared to Amanda's approved version. @SashaWeinstein