gwu-libraries / TweetSets

Service for creating Twitter datasets for research and archiving.
MIT License

Upgrade to Python 3.8 #126

Closed lwrubel closed 3 years ago

lwrubel commented 3 years ago

Use the python:3.8 Docker image and migrate the app to Python 3.8.

lwrubel commented 3 years ago

It is possible to run the loader container on a different version of Python if desired.

lwrubel commented 3 years ago

Check whether the existing redis-py and Celery versions can be used with Python 3.8 before upgrading those dependencies.

lwrubel commented 3 years ago

Migrating to Python 3.7+ requires at least Celery 4.3 (see the Celery upgrading docs). This will also upgrade its dependencies: kombu, billiard>=3.6., amqp, python-dateutil, and vine. The Kombu upgrade is required because its async module had to be renamed to asynchronous once async became a reserved keyword in Python 3.7.
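For anyone who wants to see the keyword problem directly, here is a minimal sketch that asks the interpreter whether each import statement still parses. It only compiles the source text, so Kombu does not need to be installed. The helper name `compiles` is made up for illustration.

```python
import keyword
import sys

# On Python 3.7 and later, `async` is a reserved keyword.
print(sys.version_info[:2], keyword.iskeyword("async"))


def compiles(source: str) -> bool:
    """Return True if `source` parses as valid Python on this interpreter."""
    try:
        # compile() only parses the text; it does not execute the import.
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


old_import = "import kombu.async"         # module path used by older Kombu
new_import = "import kombu.asynchronous"  # renamed module path

old_ok = compiles(old_import)
new_ok = compiles(new_import)
print("old path parses:", old_ok)
print("new path parses:", new_ok)
```

On a 3.7+ interpreter the old path should fail to parse, which is why the older Kombu cannot simply be kept.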

redis-py>=3.2.0 is also required (see the redis-py 3.x release notes). Note that version 3 has breaking changes.
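A rough sketch for checking an environment against the minimums mentioned in this thread. Assumptions: the PyPI distribution names are `celery`, `redis` (for redis-py), and `billiard`; the billiard minimum is taken as 3.6; and the version parsing is deliberately naive (leading integers only), not a full PEP 440 comparison. The function names are hypothetical.

```python
from importlib import metadata  # in the standard library from Python 3.8

# Minimums as stated in this thread (billiard taken as 3.6, an assumption).
MINIMUMS = {"celery": "4.3", "redis": "3.2.0", "billiard": "3.6"}


def version_tuple(version: str) -> tuple:
    """Turn '4.3.0rc1' into (4, 3, 0), keeping only leading digits per part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions after padding the shorter one with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_installed(minimums=MINIMUMS) -> dict:
    """Map each package to True/False, or None if it is not installed."""
    results = {}
    for name, minimum in minimums.items():
        try:
            results[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            results[name] = None
    return results


if __name__ == "__main__":
    for name, ok in check_installed().items():
        print(f"{name}: {ok}")
```

Running it inside each container prints `True`, `False`, or `None` (not installed) per package, which gives a quick picture before touching requirements.txt.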

The Python upgrade in the loader and spark-master containers will need to come later, since 1) Spark 2.4 will not work with Python 3.8, so it requires a major version upgrade to Spark 3.x, and 2) openjdk-8-jdk is not available for a 3.8-buster base image without adding further repositories. I'm assuming that having different versions of Java in the loader and Spark containers would be problematic.

dolsysmith commented 3 years ago

I've

lwrubel commented 3 years ago

Was this supposed to be closed? Re-opening since it has an open PR #130.