splunk / docker-splunk

Splunk Docker GitHub Repository

Splunk Search Heads fail to start in Docker Swarm #534

Open rskntroot opened 2 years ago

rskntroot commented 2 years ago

Issue Description:

I have been experimenting with Docker Swarm and have run into an issue where Splunk containers with role: search_head or search_head captain fail to start in a Docker Swarm environment.

Project Codebase:

https://github.com/rskntroot/splunk

NOTE: I understand that splunk in docker swarm is unsupported for a reason.

NOTE: I have managed to get a 3x [search_head] 1x [deployer] 1x [indexer] setup to work fully in docker swarm with the following workaround

Workaround:

During testing of the workaround I have found that:

Conclusion:

Pre-workaround: I was able to docker exec into a search container but was unable to connect to the other search nodes from there. There were no issues connecting to the deployer or indexer.

It seems that the splunk-ansible configurations do not put the container in a state where Docker Swarm will publish the container's IP to Docker DNS.

I'm at my wits' end on this one and was wondering if anyone wants to give me some pointers on how to create an Ansible playbook for this case 🤷🏻‍♂️ (setting the Docker state is handled in entrypoint.sh)

rskntroot commented 2 years ago

Closing issue as it is not related to splunk's docker image configuration. Opened issue in splunk-ansible here: https://github.com/splunk/splunk-ansible/issues/672

rskntroot commented 2 years ago

Reopening as splunk-ansible commands do not impact the status of the container. The docker service does not publish the container in DNS until the container is "healthy".

After taking a further look at entrypoint.sh, it seems that the issue can be resolved in this file. I was able to resolve the issue with Docker DNS by setting the container status to started upon the completion of prep_ansible. This is obviously not ideal.

Recommendation: Split setup into two phases, (common) and (splunk_role). After the common setup phase completes, set the container as healthy.

ansible-playbook < splunk common phase >
echo "started" > "${CONTAINER_ARTIFACT_DIR}/splunk-container.state"
ansible-playbook < splunk role phase >
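
For anyone who wants to experiment with the recommendation above, here is a rough sketch of how the two phases could be wrapped in a shell function. This is not code from entrypoint.sh: the function name two_phase_setup, the RUN_PLAYBOOK override, and the playbook arguments are all hypothetical, and only CONTAINER_ARTIFACT_DIR and splunk-container.state come from this thread. It assumes the common phase can actually be run as its own playbook, which the current setup may not allow without changes.

```shell
#!/usr/bin/env bash
# Sketch only: two_phase_setup, RUN_PLAYBOOK and the playbook arguments are
# hypothetical names for illustration, not taken from entrypoint.sh.

two_phase_setup() {
    local common_playbook="$1"
    local role_playbook="$2"
    # Overridable runner so the flow can be dry-run without Ansible installed.
    local runner="${RUN_PLAYBOOK:-ansible-playbook}"
    local state_file="${CONTAINER_ARTIFACT_DIR:?must be set}/splunk-container.state"

    # Phase 1: role-independent setup. Bail out on failure so the container
    # is never marked as started on top of a broken common setup.
    "${runner}" "${common_playbook}" || return 1

    # Write the state marker between the phases, so the container can be
    # reported healthy before role-specific clustering begins.
    echo "started" > "${state_file}"

    # Phase 2: role-specific setup (for example, search head clustering),
    # which needs the other nodes to be resolvable.
    "${runner}" "${role_playbook}"
}
```

Calling it would look like `two_phase_setup <common playbook> <role playbook>` with CONTAINER_ARTIFACT_DIR already exported. Setting RUN_PLAYBOOK=true makes it a dry run that only writes the state file. The trade-off is the same one noted above: the container is marked as started before the role phase finishes, so anything relying on that marker to mean "fully configured" would see it too early.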