gwu-libraries / TweetSets

Service for creating Twitter datasets for research and archiving.
MIT License

Update example.docker-compose.yml, example.env, and README #95

Closed (kerchner closed 2 years ago)

kerchner commented 3 years ago

Fix inconsistent folder names and lines that should be commented out.

kerchner commented 3 years ago

README questions:

gwtweetsets-dev1/2 only has 1 cluster node (aside from primary) and there's no issue.

kerchner commented 3 years ago

Moving this out of 2.1

dolsysmith commented 3 years ago

If we add these lines to loader.docker-compose.yml, we can use the Spark UI to monitor job performance:

ports:
     - 4040:4040

Note that this is different from the SPARK_UI_PORT referenced in Dockerfile-loader. That port serves information specific to the Spark node, but in my experience the more useful information is on port 4040. SSH tunneling is a good way to access this port without exposing it on prod.
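
For illustration, here is a minimal sketch of where the mapping might sit in loader.docker-compose.yml. The service name `loader` is an assumption, not taken from the repository. The sketch also differs from the snippet above in one respect: it binds the host side to 127.0.0.1, which is one possible way to keep the port off public interfaces so that it is reachable only through a tunnel. The user and host in the commented SSH command are placeholders.

```yaml
services:
  loader:  # hypothetical service name; use the one already defined in the file
    ports:
      # Publish the Spark application UI (port 4040) on the host's loopback interface only
      - "127.0.0.1:4040:4040"

# From a workstation, forward local port 4040 over SSH (placeholder user and host):
#   ssh -N -L 4040:localhost:4040 user@tweetsets-host.example.org
# then open http://localhost:4040 in a browser.
```

With the plain `4040:4040` mapping, Docker publishes the port on all host interfaces, so whether it is reachable from outside depends on the host's firewall.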

kerchner commented 2 years ago

ports:
     - 4040:4040

was added as part of https://github.com/gwu-libraries/TweetSets/pull/153