Closed SashaWeinstein closed 2 years ago
Tested on my local and results look good to me. One minor change I want to make is the pre_corr_val
is depreciated now since neither the correction by id nor correction by value workflow consider "pre corrected value" anymore before making the change. So I am deleting it out of all the table to avoid any confusion (they would be all null
in any case). See the commit below for this change.
@AmandaDoyle the new commit made above should have the updated csv and also logic to do corrections based on whether the uid is null or not.
Ran on my machine with some extra print statements and got the expected behavior
a minor change above for where deleting is happening for the applied corrections see @abrieff comments above. Tested on my local and results are still good. I think it should be good to merge in and kick off production since the change is minor compared to Amanda's approved version. @SashaWeinstein
Fix modification process
Issue #198 outlines the issue this PR addresses. The fix requires adding more logic to an already convoluted
_procedures.sql
. I think this code can be made shorter and simpler with a bigger refactor, but that's outside the scope of getting this build out. Te and I worked on all this code together, pair coding through all the development and testing.Split modifications table into
by_id
andby_value
As @abrieff pointed out, this step isn't necessary and could be achieved with a query within the logic of
procedures.sql
. However we chose this more verbose approach as it was easier to check our work along the way with postico. Instead of one call ofapply_correction
in the build step there are now two calls, one for each table.New logic in
_procedures.sql
Two if/else blocks were added to apply the correct logic based on the parameters passed to the procedure call. The first new if/else block is to set the
records_to_recode
boolean (which replaces theapplicable
boolean). If the recode is by uid, it checks if there are any records with the given uid. If the recode is by value, it checks if there are any records with the given value for the given field. If there are any records to be recoded,records_to_recode
is set to true. The second new if/else block determines how records are recoded. If the recode is by uid, the new value is set for the given field for all records with a matching uid. If the recode is by value, then all records with a matchingold_value
for the given field are recoded. No changes are made to the logic in situations in which there are zero records to recode.Testing
We tested the code by running the modifications table as is, and also changing the
uid
in the modifications by id table to the actual uid of an existing record to see if it successfully recoded the value. In both situations the code appeared to work.Misc changes
The call of
load_modifications.sql
was moved from the dataloading step to the build step as it's logic we want to run each time we build. I think it should live in build as more logic has been added.