Nathan-Nesbitt closed this 5 months ago
Hi @Nathan-Nesbitt, I have no idea about swarm, but I'd like to ask for some more info: This is an existing setup that you are upgrading, right? And you want to connect to an existing Elasticsearch 7.10.2?
In this case, you should specify the connection string to Elasticsearch in your docker-compose files (https://go2docs.graylog.org/5-2/setting_up_graylog/graylog_data_node_getting_started.htm?tocpath=Setting%20up%20Graylog%7CGraylog%20Data%20Node%7C_____1)
Something like GRAYLOG_ELASTICSEARCH_HOSTS: "http://opensearch1:9200,http://opensearch2:9201,http://opensearch3:9202"
This should at least skip the "waiting for the initial setup" step - which in your case is not necessary.
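For context, here is a minimal sketch of where that setting might sit in a compose/stack file. The service name, the image tag, and the OpenSearch host names are placeholders (the hosts are copied from the example above, not from the reporter's setup), and the other settings Graylog needs (MongoDB URI, password secret, etc.) are omitted.

```yaml
# Sketch only: service name and image tag are assumptions for illustration.
services:
  graylog:
    image: graylog/graylog:5.2
    environment:
      # Comma-separated list of search backend nodes. With this set,
      # Graylog should connect to the existing cluster instead of
      # waiting for the initial DataNode setup.
      GRAYLOG_ELASTICSEARCH_HOSTS: "http://opensearch1:9200,http://opensearch2:9201,http://opensearch3:9202"
      # ... remaining Graylog settings unchanged ...
```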
@janheise great workaround thank you it's back up 💯
Looks like this is still a problem tho as I won't be able to redeploy on any new machines. Let me know how I can help diagnose / narrow this down and I'll do what I can to help!
@Nathan-Nesbitt - I'm happy that I was able to help. As I wrote, I have no experience with swarm. By redeploy, do you mean "set up a whole new cluster from scratch" or "add new machines to an existing cluster"? As long as the MongoDB stays intact (unless, as I said, you want to start from 0), adding new machines should work out fine.
A completely new setup should work fine, too, if you use plain OpenSearch/Elasticsearch. For the DataNode, I'd have to make some tests with swarm - but with some manual pre-configuration, it should work, too.
@janheise Totally, I am more concerned if I need to set this up on a whole new cluster!
What would be involved with the manual configuration? :)
@Nathan-Nesbitt Let me reiterate: if you use plain OpenSearch for new setups nothing changes except the elasticsearch_hosts setting that is now mandatory.
The first step we undertook with the DataNode was simplifying the SSL configuration for your setups by adding a UI etc. You can also generate your own certificates and add them to the config manually, and by doing so skip the initial configuration, too. We'll probably support swarm installations etc. better in upcoming releases. 5.2 is our first release with the DataNode. I'll put "test with and support swarm" on our to-do list.
problem seems to be fixed
Seems that on first run in the swarm, the container does not serve up a positive healthcheck when you hit this point:
When I go to visit the URL I run into 2 issues:
Since it appears that the only way to set up the server is by using the generated auth to log in for the first time and set up the server, I imagine the health check should pass at this point so it avoids the previous 2 issues? Seems docker checks the health status a couple of times before killing it, and it fails each time:
These are the logs at the end:
Maybe it's a config issue? I'm not sure how the healthcheck is done, so I cannot go much further.
Expected Behavior
Graylog docker container should respond with a positive healthcheck when the server is waiting for the admin to set up the fresh installation.
Current Behavior
The server doesn't respond with a status while it waits for setup. When it stays in that state for an extended period in a swarm, the container is automatically restarted under the assumption that it is stalled/dead.
Possible Solution
Container should respond with healthy once we expose a port waiting for the client to set it up. As far as I can tell the client is working as expected, there are no errors in the log and I can run it outside of a swarm.
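Until the container reports healthy during setup, one possible stopgap on the swarm side is to relax the health check in the stack file. This is an untested sketch, not something confirmed in this thread: the service name, image tag, and timing values are assumptions, and the setup wait is open-ended, so a grace period only buys a fixed window.

```yaml
# Sketch only: service name, image tag and durations are assumptions.
services:
  graylog:
    image: graylog/graylog:5.2
    healthcheck:
      # Failures during start_period do not count toward `retries`,
      # which leaves time to log in and finish the first-run setup.
      # No `test` is given, so the image's own health check command is kept.
      start_period: 10m
      interval: 30s
      retries: 5
      # Alternative: turn the check off entirely while setting up.
      # disable: true
```

Once the initial setup is done, the override can be removed so the image's default health check behaviour applies again.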
Steps to Reproduce (for bugs)
Context
Recently upgraded to a newer version of graylog, was using it for server logging on multiple applications within docker swarm. Cannot get the application going now so no logging :(
Your Environment
graylog --> portainer --> traefik