deployment-gap-model-education-fund / deployment-gap-model

ETL code for the Deployment Gap Model Education Fund
https://www.deploymentgap.fund/
MIT License

Add implicit dependencies to requirements.txt #74

Open TrentonBush opened 2 years ago

TrentonBush commented 2 years ago

Docker builds can get out of sync between users, likely because pip packages are cached during the build process. Pinning the dependencies should resolve this.

Which ones to pin? I think we want to pin anything we use directly, even if the dependency is technically satisfied through another package. The biggest example of this is pandas, which is implicitly installed via our dependency on catalystcoop.pudl. I believe such packages should be listed explicitly in requirements.txt.
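For illustration, here is a rough sketch of how pinned lines could be generated from whatever versions are currently installed. It uses only the standard library's importlib.metadata; the helper names pin_line and pinned_requirements are hypothetical, and the distribution names in the usage comment are examples rather than the final list.

```python
from importlib.metadata import PackageNotFoundError, version


def pin_line(distribution: str) -> str:
    """Return a 'name==version' line for an installed distribution."""
    return f"{distribution}=={version(distribution)}"


def pinned_requirements(distributions):
    """Build requirements.txt lines, flagging anything that is not installed."""
    lines = []
    for name in distributions:
        try:
            lines.append(pin_line(name))
        except PackageNotFoundError:
            # Leave a comment instead of guessing a version.
            lines.append(f"# {name}: not installed in this environment")
    return lines


# Example (output depends on the local environment):
# print("\n".join(pinned_requirements(["pandas", "catalystcoop.pudl"])))
```

The versions this reports are only as trustworthy as the environment it runs in, so it would make sense to run it inside a freshly built container rather than on a machine with a stale cache.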

Below is a list of those implicitly installed but directly used packages. I compiled this list by running find . -name '*.py' -exec grep "import" {} \; | sort | uniq and manually looking for packages from outside the standard library and not already in requirements.txt.

By this method, the final package list for requirements.txt would have:

TrentonBush commented 2 years ago

Oops, I didn't see your PR #73