Closed lukewendling closed 7 years ago
Use Spark 2.0 docker containers (1 master/ 1 slave) to encapsulate a simple py script that loads data directly into mongo (loopy not a good candidate) for ingesting fairly large datasets (> 10M posts).
I looked over what you had, please commit and we should be good.
accidentally merged to 88 branch after it was itself merged to another. let's leave it in 88 branch for later use, if needed.
Use Spark 2.0 docker containers (1 master/ 1 slave) to encapsulate a simple py script that loads data directly into mongo (loopy not a good candidate) for ingesting fairly large datasets (> 10M posts).