HariSekhon / Nagios-Plugins


check_elasticsearch_*: feature request: Support multiple hosts #359

Open hansbogert opened 3 years ago

hansbogert commented 3 years ago

The elasticsearch plugins do not allow multiple -H parameters. The plugins should be able to fall back on a list of Elasticsearch nodes, erroring only if all nodes are down, and otherwise adhere to the semantics of the Elastic API. IMHO this makes the most sense with a clustered system like Elasticsearch, since we are usually interested in cluster-wide stats rather than node-specific stats (there are exceptions, of course).
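The requested fallback semantics could be sketched roughly like this (a minimal illustration, not the plugins' actual code; the function name and probe callable are hypothetical):

```python
def first_responding(hosts, probe):
    """Return (host, result) for the first host whose probe succeeds.

    Raises RuntimeError only if every host fails, matching the requested
    "error only if all nodes are down" semantics.
    """
    errors = []
    for host in hosts:
        try:
            return host, probe(host)
        except Exception as exc:  # collect failures so the final error lists them all
            errors.append(f"{host}: {exc}")
    raise RuntimeError("all nodes down: " + "; ".join(errors))
```

Here `probe` would wrap whatever single-host request the existing plugin already makes, so the cluster-wide check runs against the first reachable node.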

Would this proposed functionality be merged if it were implemented, or are there objections?

HariSekhon commented 3 years ago

I had considered this a few years ago for not just this but other cluster plugins as well.

It seems simpler and safer to abstract this cluster-connectivity functionality, so I've written up two ways to do that in the readme, with links to all the relevant code and configuration:

https://github.com/HariSekhon/Nagios-Plugins#high-availability--multi-master-testing

sharkyzz commented 3 years ago

Why doesn't find_active_server.py support a username and password?

HariSekhon commented 3 years ago

@sharkyzz good question - find_active_server.py was designed to be fast (hence multi-threaded), and if you look at the use cases of the subclassed adjacent find_active_*.py programs (including find_active_elasticsearch.py), none of them needed authentication to find the active master, or an alive peer in Elasticsearch's case... but I guess I could have switched the check_http method to use some of my other pylib components to support HTTP basic auth + Kerberos auth as most of my other programs do.
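For reference, wiring HTTP basic auth into a urllib-based probe could look roughly like this (a minimal sketch under stated assumptions; the actual plugins use the author's pylib components, which are not shown here):

```python
import base64
import urllib.request


def authed_request(url, user, password):
    """Build a GET request carrying an HTTP Basic Authorization header."""
    req = urllib.request.Request(url)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    return req  # pass to urllib.request.urlopen(req, timeout=...) to probe the node
```

Kerberos/SPNEGO would need an extra dependency (e.g. requests-kerberos) rather than the standard library.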

This isn't something I'm working on right now due to other work priorities but you're welcome to submit a patch for it.

hansbogert commented 3 years ago

Well, there is an advantage to having it self-contained in the plugins. Responding to the two methods in your referenced documentation:

1. Using a proxy: yet another system to configure and monitor, and simply not how Elastic works. Elasticsearch clients are generally cluster-aware.
2. Using shell subprocesses is smart, but I don't think it would work easily in a Nagios and/or Icinga setup. And as your exchange above with @sharkyzz shows, this leads to all kinds of feature creep.

HariSekhon commented 3 years ago

@hansbogert valid points.

There are already a find_active_elasticsearch.py and an HAProxy elasticsearch.cfg supplied which generalize this use case; both take only a few minutes to set up.
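The HAProxy approach amounts to a small local listener fronting the cluster, roughly like this (a hypothetical sketch only; hostnames are placeholders and the bundled elasticsearch.cfg in the repo is the authoritative version):

```
# Local proxy in front of the Elasticsearch cluster; monitoring plugins
# then point -H at 127.0.0.1 and HAProxy handles node failover
listen elasticsearch
    bind 127.0.0.1:9200
    balance roundrobin
    option httpchk GET /
    server es1 es1:9200 check
    server es2 es2:9200 check
    server es3 es3:9200 check
```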

I had intended to use the elasticsearch Python library a couple of times during my days with Elasticsearch 2.x and 5.x (it's still commented out in requirements.txt), which would have directly supported multiple peer seeds instead of direct REST API calls, but that led to even more problems at the time.
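Using the official client's multi-host support might look roughly like this, e.g. accepting a comma-separated -H value (the flag semantics and helper here are hypothetical; the elasticsearch package is the optional dependency still commented out in requirements.txt):

```python
def parse_hosts(value, default_port=9200):
    """Split a comma-separated host list, filling in a default port.

    e.g. 'es1,es2:9201' -> ['es1:9200', 'es2:9201']
    """
    hosts = []
    for item in value.split(","):
        item = item.strip()
        hosts.append(item if ":" in item else f"{item}:{default_port}")
    return hosts


# from elasticsearch import Elasticsearch  # optional dependency
# es = Elasticsearch(parse_hosts("es1,es2,es3"))  # client fails over across peers
# health = es.cluster.health()
```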

I don't mind reviewing and accepting pull requests if you want to fork different versions of these plugins (under new names, so as not to impact anybody using the current versions), especially if they switch to the elasticsearch Python library, but I don't use Elasticsearch any more, so I'm not actively developing new functionality - I solved all my use cases at the time with the existing code.