Closed kerchner closed 2 years ago
README questions:
Set vm_max_map_count as described in the ElasticSearch documentation
be implemented on both nodes of the cluster?Is this still true?
Clusters must have at least a primary node and two additional nodes.
Edit .env. This file is annotated to help you select appropriate values. Note that 2 cluster nodes must have MASTER set to true.
gwtweetsets-dev1/2 only has 1 cluster node (aside from primary) and there's no issue.
Moving this out of 2.1
If we add this line to loader.docker-compose.yml
, we can use the Spark UI to monitor job performance.
ports:
- 4040:4040
Note that this is different from the SPARK_UI_PORT
referenced in Dockerfile-loader
. The latter has information specifically about the Spark node, but in my experience, the more useful information is provided on 4040
. SSH-tunneling provides a good way to access this port without exposing it on prod.
ports:
- 4040:4040
was added as part of https://github.com/gwu-libraries/TweetSets/pull/153
Fix inconsistent folder names, and lines which should be commented out.