Open PScharrenberg opened 1 year ago
This could be caused by the Elasticsearch sniffing feature.
For details on how this feature is configured, see: https://github.com/uken/fluent-plugin-elasticsearch#sniffer-class-name
You probably hit this time bomb someone left for you: https://github.com/uken/fluent-plugin-elasticsearch#reload-after
This causes the activation of the sniffer. Yes, a sniffer that hunts out the nodes in your ES cluster and then bypasses the configuration you explicitly set, thereby voiding any load balancing you may have configured. Bonus feature: it uses the scheme from the config you supplied to hit the host and port it finds in the nodes catalog.
I'd recommend reload_connections false, as the sniffer just shouldn't be needed in any properly configured environment.
You'd either correctly configure the hosts it uses, or use a load balancer.
This "feature" should only be enabled if explicitly needed, which should be never.
IMHO the sniffer should exist as an optional plugin, and should be promptly removed/disabled.
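As a rough sketch of the recommendation above, a match block with connection reloading turned off might look like the following. The match pattern and the host name es.example.com are placeholders and not taken from the original report; port 443 and the https scheme are assumed from the proxy setup described in the issue. reload_on_failure is shown as an additional assumption, on the expectation that it also keeps the plugin from re-discovering nodes after an error.

```
<match **>
  @type elasticsearch
  # Placeholder host: the proxy in front of Elasticsearch, not a cluster node
  host es.example.com
  port 443
  scheme https
  # Do not sniff the cluster for node addresses; keep using the host above
  reload_connections false
  # Assumption: also skip node re-discovery after a failed request
  reload_on_failure false
</match>
```

With both options set to false, the plugin should keep sending to the configured host and port instead of the addresses the nodes advertise.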
Problem
fluent-plugin-elasticsearch successfully pushes logs to our Elasticsearch server, which sits behind an SSL-offloading nginx proxy listening on port 443. After a few hours, logs stop being transferred and we find this warning message in the fluentd logs (where X.X.X.X is the correct IP address of our ES server):
So after a while it tries to connect to the Elasticsearch server directly, without the proxy, which obviously does not work.
After restarting fluentd inside the k8s pod (fluent-ctl restart), the logs are shipped again.
Steps to replicate
The relevant config part in fluentd.conf:
Expected Behavior or What you need to ask
We expect it to continue connecting to the configured port.
Using Fluentd and ES plugin versions
We're using the rancher-logging "app" provided by Rancher (rancher-logging:100.1.3+up3.17.7). We're seeing this issue after upgrading from an older version.