NYCPlanning / db-developments

🏠 🏘️ 🏗️ Developments Database
https://nycplanning.github.io/db-developments
8 stars 2 forks source link

dob geocoded join issue #541

Closed td928 closed 2 years ago

td928 commented 2 years ago

539 1 reviewer is good 🐠

overview

Solution to the problem is actually really simple and totally self-inflicted (I removed the DOB_DATA_DATE in an earlier commit when I can't get data sync to run correctly.

I think there will be a follow up PR after this is merged in to dev and we can look at the results with qaqc app. Then another one points to the main for rerunning the final output.

Review Tips

To test locally, I recommend rerunning the dob part of the dataloading. One thing to pay attention to here is the dob_geocoded_results would get loaded and renamed to _geo_devdb. Something about the primary key constraint might prevent you from reloading it properly if you have a local build. I would recommend deleting the _geo_devdb before trying run the dataloading.

Once build finish I think there are few ways to check. One is to run the test comparing final bbl and job_number because they should now be in the same borough (first digit is the same). There should be only 6600 records and most of them dob now records which job number starts with a letter instead of .

Another way is to look into the final_qaqc table where the flags dem_bn_overlap and dup_bbl_address_units now should go back to on par with 20Q4. Of course, this might be easier done when output is rerun on dev and we can use the qaqc app for that.

SashaWeinstein commented 2 years ago

I want to flag that the manual corrections table doesn't match as PR #538 was merged to main and not to dev. I think we should open a PR to get that change into dev before we merge this. I can do that now

SashaWeinstein commented 2 years ago

I also think this feature branch was branched off main instead of dev, maybe solution is rebase instead

td928 commented 2 years ago

I think it is totally fine to merge this along with the new manual corrections table since it is not really code change @SashaWeinstein Unless it would cause some git issues down the line then I am all for a PR just for that. And just to confirm it is branched off main.

td928 commented 2 years ago

I am guessing it would be fine but if it would mess up the merge from dev to main for the future then I would definitely listen and try rebasing.

SashaWeinstein commented 2 years ago

I the best way forward is that the changes to the manual corrections should be merged from main to dev and this PR should be rebased

td928 commented 2 years ago

Or I will just point this branch to merge into main but also put this PR on hold until I have another PR for the manual correction work for dev

SashaWeinstein commented 2 years ago

Wait sorry I have another question on branching here: should this PR to point to main instead?

mbh329 commented 2 years ago

Took a look at the final_devdb table and the job_number and bbl match now, there are 5 records that do not match but I think that is coming directly from the DOB data

td928 commented 2 years ago

Wait sorry I have another question on branching here: should this PR to point to main instead?

For production yes, I was hoping to take a shortcut to merge into dev first then run the qaqc app. But if that seems too much corner cutting, I can make this point to main and start another branch to point to dev

td928 commented 2 years ago

switched to main for simplicity sake. I looked at the final qaqc table on local run and think we can skip the step to look at qaqc app.

SashaWeinstein commented 2 years ago

Hm yea I'm not quite sure what is best. My current thought is that we should merge this to main and then look at qaqc from separate feature branch we never merge

SashaWeinstein commented 2 years ago

I just don't want the the manual corrections to be updated in dev on this PR. So if we merged this to dev I think it would be better to keep the manual corrections untouched

SashaWeinstein commented 2 years ago

switched to main for simplicity sake. I looked at the final qaqc table on local run and think we can skip the step to look at qaqc app.

Ah sorry I didn't see this when I commented. Sounds good! Housing team will tell us if something is messed up anyway right