T47 elasticsearch upgrade

lwrubel commented 3 years ago

To set up:

After checking out the branch on each VM, move or rename the existing docker-compose.yml and .env files.
Copy the example versions of the appropriate docker-compose and env files onto the primary node and cluster node(s), and customize them with hostnames, IPs, and master status. There must be two master-eligible nodes, so set MASTER=true on both.
Edit the docker-compose.yml file on the primary node so that it builds the server and loader images instead of pulling them.
Remove any existing loader and server images on the primary node:
    docker rmi ts_server ts_flaskrun ts_loader
    docker rmi gwul/tweetsets-server gwul/tweetsets-flaskrun gwul/tweetsets-loader
Build and start the containers:
    docker-compose up -d --build
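
For reviewers who want to script this, here is a rough sketch of the steps above. The `run` helper, the `DRY_RUN` switch, the function names, and the `.bak` suffix are assumptions made for illustration and are not part of this branch. Copying and customizing the example files is left as a manual step, because it depends on each node's hostnames and IPs.

```shell
#!/bin/sh
# Hypothetical helper: with DRY_RUN=1 (the default here) each command is
# printed instead of executed, so the sequence can be reviewed first.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on every VM: set the existing config aside.
# The .bak suffix is an assumption; any rename works.
backup_config() {
    run mv docker-compose.yml docker-compose.yml.bak
    run mv .env .env.bak
}

# Run on the primary node only, after the example files have been
# copied, customized, and switched from pulling to building.
rebuild_primary() {
    # Drop locally built and previously pulled images so that stale
    # copies are not reused. A missing image is not treated as fatal.
    run docker rmi ts_server ts_flaskrun ts_loader || true
    run docker rmi gwul/tweetsets-server gwul/tweetsets-flaskrun gwul/tweetsets-loader || true
    run docker-compose up -d --build
}
```

To use it, source the file, call `backup_config`, copy and edit the example files by hand, then call `rebuild_primary`. Set `DRY_RUN=0` to execute the commands for real.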

Test loading a dataset as part of the review.

lwrubel commented 3 years ago

I'm noticing that the spark container still needs some upgrade steps before it will work, including upgrading the elasticsearch-hadoop-6.2.2.jar that is in this repo and referenced in Dockerfile-loader. I will push an update to this branch, but please review the other functionality in the meantime.

dolsysmith commented 3 years ago

Datasets successfully loaded with both regular loader and spark-loader. UI working as expected.

gwu-libraries / TweetSets

T47 elasticsearch upgrade #62