gwu-libraries / TweetSets

Service for creating Twitter datasets for research and archiving.
MIT License

Move Python dependencies to setup.py #110

Open dolsysmith opened 3 years ago

dolsysmith commented 3 years ago

Currently, every container installs all of the dependencies listed in requirements.txt, even though no single container needs all of them. pyspark in particular is a large library (~250 MB) that is used only in the loader container.

Managing requirements through setup.py, instead of running pip install -r requirements.txt in each container, would allow greater flexibility: each container could install only the requirements it needs. (This would mainly benefit development, though it would also save space in production.)
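
One possible shape for this, as a rough sketch rather than a worked-out proposal, is to use the setuptools extras_require option so that pyspark is pulled in only when the "loader" extra is requested. The package name, version, and the shared dependency (requests) below are placeholders I made up for illustration; only pyspark and the loader container come from the description above.

```python
import sys

# Hypothetical: dependencies every container needs (placeholder entry).
COMMON_REQUIREMENTS = ["requests"]

# Optional dependency groups, keyed by the container that needs them.
# Only the loader container needs pyspark.
EXTRAS = {
    "loader": ["pyspark"],
}


def setup_kwargs():
    """Build the keyword arguments that would be passed to setuptools.setup()."""
    return {
        "name": "tweetsets",       # assumed package name
        "version": "0.0.0",        # placeholder version
        "install_requires": list(COMMON_REQUIREMENTS),
        "extras_require": {name: list(reqs) for name, reqs in EXTRAS.items()},
    }


# Call setup() only when run with a command (as pip and setuptools do),
# so the file can also be imported or executed bare for inspection.
if __name__ == "__main__" and len(sys.argv) > 1:
    from setuptools import find_packages, setup

    setup(packages=find_packages(), **setup_kwargs())
```

With something like this in place, the loader container's Dockerfile could run pip install ".[loader]", while the other containers would run pip install . and skip pyspark entirely.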