Closed td928 closed 2 years ago
Great. That makes more sense now. It looks good to me.
no emoji 😞
Major misstep by me. To make up for it with some flair 🐉: one reviewer is fine.
Nice emoji! OK, so I took a look at the other qaqc scripts we could move into the new bash script.
- MID_devdb: no scripts called after these two change MID_devdb.
- _INIT_devdb: this table is only subsequently referenced in _geo.sql, and it's not changed there.
- UNITS_devdb: this is referenced in _mid.sql but not changed there.

So my conclusion is that all these scripts can be run after the build is completed. To be totally sure, we should run the qaqc before, make the change, and then run the qaqc again, but I wanted to share my thought process to get confirmation before spending the time.
My assessment is the same as yours, @SashaWeinstein. I agree with your test to make sure the "in between" steps did not impact the qaqc as well. How are you comparing the results, though? I guess you can subtract the two mid_qaqc outputs from each other to see if everything comes out zero, since the dataframe is entirely binary. Is this what you are thinking as well?
I guess pandas has a nice function for comparing DataFrames. Going to test this now.
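For anyone following along, here is a minimal sketch of the two checks discussed above: subtracting the two tables and confirming everything is zero, and using pandas' built-in `DataFrame.compare` (available in pandas 1.1 and later). The function name `compare_binary_frames` and the `key` column are hypothetical; the sketch assumes both tables share the same unique keys and that every non-key column holds only 0/1 flags.

```python
import pandas as pd


def compare_binary_frames(before, after, key):
    """Compare two 0/1 flag tables on a shared key column.

    Returns (identical, diff): `identical` is True when every flag matches,
    and `diff` holds only the cells that differ.
    """
    # Align both frames on the key so row order does not matter.
    before = before.set_index(key).sort_index()
    after = after.set_index(key).sort_index()

    # Assumption: both runs produce the same keys and the same columns.
    if not (before.index.equals(after.index)
            and before.columns.equals(after.columns)):
        raise ValueError("tables do not share the same keys and columns")

    # Subtraction check: with binary flags, any nonzero cell is a change.
    delta = after - before
    identical = bool((delta == 0).all().all())

    # DataFrame.compare keeps only the differing cells, labelled
    # "self" (first frame) and "other" (second frame).
    diff = before.compare(after)
    return identical, diff
```

If `identical` comes back False, `diff` shows which keys and flags changed between the two runs, which is easier to read than scanning the full subtraction result.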
Awesome! I had not thought of a good way to test, was honestly probably just going to eyeball it, but doing it programmatically is definitely better. I use psycopg2 to pull tables down from postgres and load them into pandas, if you're looking for a nice way.
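A rough sketch of that approach, in case it helps. It is an illustration rather than the exact code used here: the helper names `table_to_dataframe` and `load_mid_qaqc` are made up, and the connection string is whatever your environment provides. The first helper only relies on the standard DB-API cursor interface, which psycopg2 connections implement.

```python
import pandas as pd


def table_to_dataframe(conn, query):
    """Run a query on a DB-API connection and return the rows as a DataFrame."""
    cur = conn.cursor()
    try:
        cur.execute(query)
        # cursor.description has one entry per column; item 0 is the name.
        columns = [col[0] for col in cur.description]
        return pd.DataFrame(cur.fetchall(), columns=columns)
    finally:
        cur.close()


def load_mid_qaqc(dsn):
    """Hypothetical helper: connect with psycopg2 and pull the mid_qaqc table."""
    import psycopg2  # imported here so the helper above works without it

    conn = psycopg2.connect(dsn)
    try:
        return table_to_dataframe(conn, "SELECT * FROM mid_qaqc")
    finally:
        conn.close()
```

Pulling the table once before the change and once after gives two DataFrames that can be compared directly.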
A conversation with @SashaWeinstein set me right, and the tests I ran look all good. I attached the Jupyter notebook I ran the tests with, in case anyone wants to replicate them on their side. Thanks all! qaqc_compare.ipynb.zip
519
Overview
The goal is to make the process more iterative and simpler for testing during development of the qaqc app. You will probably notice that not all the qaqc steps are included in the new 03_qaqc.sh, e.g. qaqc_mid.sql. My thought process is that we are really talking about two quite different parts of qaqc: one is the report generated for the housing research team to do manual research, and the other is what we use to generate the application for our team. So I think it is okay to include only the specific qaqc steps we use for the application, and to make that explicit I might change the name to 03_qaqc_de.sh.

Impact
Some upstream and downstream changes are also needed, namely in the yml and in devdb.sh, to incorporate a call to the new shell script.