Closed SashaWeinstein closed 1 year ago
Ok yes I see, I'll change the geocode step to call the python code directly instead of the creating another container
Ok yes I see, I'll change the geocode step to call the python code directly instead of the creating another container
can we also move the entire geocode function to the devsh.sh
? It seems to me more natural to be there
Sure, that's a good idea I can make that change
Also want to highlight that currently the way we call Python seems to be using package came with the docker images which installed with the requirements.txt. I believe if we want to use the local poetry environment now specified by the lock in this devcontainer we needs to do something like
poetry run python -m geocode
yep, working through this now
To test I:
poetry install
and activated the poetry shell
closing as redundant per mono repo
Medium PR, should have multiple reviewers 🐳 ✍️ This addresses two upgrades, #563 and #576. We need poetry to install the right packages on the devcontainer, and it's not good that we've been working this whole time with no common dependency management.
Devcontainer upgrade
A combination of facDB and EDDE devcontainer setups. The first choice I had to make was whether to build off the vscode image and download geosupport in the Dockerfile or build off the geosupport image. I chose the geosupport image as it seemed easiest. Then I added the docker compose functionality because we need a database and a geosupport container to run the pipeline
Poetry upgrade
I found these two blog posts to quickly initialize the poetry files with all the packages we use -Build requirements.txt with pipreqs -Add to poetry fro requirements.txt
Future Work
Think it should be pretty easy to make the
Data Sync
workflow obsolete. It's a ton of code to pull down datasets from data library, run the geocoding, export the tables to .csvs, and upload back up to data library. I was able to produce the geocoded tables after just running dataloading. I wrote up this improvement in #589