Open andreibrebene opened 4 weeks ago
I have the same problem with fluentd 1.17.1 and fluent-plugin-opensearch (1.1.5), with both fluentd and opensearch running in Kubernetes. For me it happens every 2-3 hours, making the system effectively unusable.
Sample error line:
2024-11-05 11:21:42 +0000 [warn]: #0 failed to flush the buffer. retry_times=0 next_retry_time=2024-11-05 11:21:43 +0000 chunk="626289ac7f7285beba39bb9c87054dee" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>\"opensearch-cluster.monitoring-platform.svc.cluster.local\", :port=>9200, :scheme=>\"https\", :user=>\"fluentduser\", :password=>\"obfuscated\", :path=>\"\"}): no address for opensearch-cluster-warm-2 (Resolv::ResolvError)"
What I find interesting is that in the above error, :host designates the configured output destination (a Kubernetes Service), but the "no address for" part mentions one of the opensearch nodes (a data-only node in this example, but it happens with all the nodes from time to time).
I have no idea why it would try to resolve an individual node (i.e. a pod).
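For reference, the "no address for ... (Resolv::ResolvError)" text matches the message Ruby's standard resolv library raises when a name lookup returns nothing. A minimal sketch that reproduces the message, assuming an empty hosts file as a stand-in for a resolver that does not know the pod name (the resolver setup and the try_resolve helper are illustrative only, not what the plugin does internally):

```ruby
require 'resolv'
require 'tempfile'

# An empty hosts file stands in for a name service that has no record
# for the individual node (hypothetical setup, for illustration only).
empty_hosts = Tempfile.new('hosts')
empty_hosts.close

# Resolver that consults only the empty hosts file, so no network is needed.
resolver = Resolv.new([Resolv::Hosts.new(empty_hosts.path)])

# Hypothetical helper: returns the address, or the error text in the
# same "message (class)" shape that appears in the fluentd log.
def try_resolve(resolver, name)
  resolver.getaddress(name)
rescue Resolv::ResolvError => e
  "#{e.message} (#{e.class})"
end

result = try_resolve(resolver, 'opensearch-cluster-warm-2')
puts result
```

So the failing lookup is for the node name itself, not for the Service name configured in :host.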
@andreibrebene not sure if it's too early to celebrate, but I seem to have mitigated the problem by setting reconnect_on_error true in the output section.
I still see the above error in the log from time to time, but now it recovers immediately on the next try!
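A minimal sketch of where that setting would go, assuming a catch-all match pattern and reusing the host, port and scheme from the error line above (the `<match **>` pattern is a placeholder, and credentials, TLS and buffer settings are left out):

```
<match **>
  @type opensearch
  host opensearch-cluster.monitoring-platform.svc.cluster.local
  port 9200
  scheme https
  # Mitigation described above: re-create the connection after an error
  reconnect_on_error true
</match>
```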
Steps to replicate
Provide example config and message
Fluentd config:
Logs:
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 fluent/log.rb:383:warn: failed to flush the buffer. retry_times=475 next_retry_time=2024-10-29 09:05:00 +0000 chunk="6259344e524c5ea61ad251179f87a5b4" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>\"192.168.10.10\", :port=>9200, :scheme=>\"https\", :user=>\"admin\", :password=>\"obfuscated\"}): no address for opensearch-node (Resolv::ResolvError)"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1135:in `rescue in send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1097:in `send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:908:in `block in write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `each'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1225:in `try_flush'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [debug]: #0 fluent/log.rb:341:debug: taking back chunk for errors. chunk="6259345bd61c3866dd7c45eaceee824a"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [debug]: #0 fluent/log.rb:341:debug: taking back chunk for errors. chunk="625934532a531331f902d98dd2ccfd50"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 fluent/log.rb:383:warn: failed to flush the buffer. retry_times=475 next_retry_time=2024-10-29 09:04:54 +0000 chunk="6259345bd61c3866dd7c45eaceee824a" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>\"192.168.10.10\", :port=>9200, :scheme=>\"https\", :user=>\"admin\", :password=>\"obfuscated\"}): no address for opensearch-node (Resolv::ResolvError)"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1135:in `rescue in send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1097:in `send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:908:in `block in write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `each'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1225:in `try_flush'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 fluent/log.rb:383:warn: failed to flush the buffer. retry_times=475 next_retry_time=2024-10-29 09:05:02 +0000 chunk="625934532a531331f902d98dd2ccfd50" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>\"192.168.10.10\", :port=>9200, :scheme=>\"https\", :user=>\"admin\", :password=>\"obfuscated\"}): no address for opensearch-node (Resolv::ResolvError)"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1135:in `rescue in send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1097:in `send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:908:in `block in write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `each'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1225:in `try_flush'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [debug]: #0 fluent/log.rb:341:debug: taking back chunk for errors. chunk="625934524b3a23f460de0b179a356bc5"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 fluent/log.rb:383:warn: failed to flush the buffer. retry_times=475 next_retry_time=2024-10-29 09:05:02 +0000 chunk="625934524b3a23f460de0b179a356bc5" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>\"192.168.10.10\", :port=>9200, :scheme=>\"https\", :user=>\"admin\", :password=>\"obfuscated\"}): no address for opensearch-node (Resolv::ResolvError)"
Expected Behavior or What you need to ask
Fluentd works for 10-12 hours and then starts logging this error. Both opensearch-node and the address can be reached from the fluentd pod, and everything else works. If I restart the fluentd pods, they start working again, and the error comes back after some hours.
Using Fluentd and OpenSearch plugin versions
OS version -> Debian GNU/Linux 12 (bookworm)
Bare Metal or within Docker or Kubernetes or others? -> Kubernetes Cluster
Fluentd v1.0 or later -> fluentd 1.17.1
OpenSearch plugin version -> fluent-plugin-opensearch (1.1.4)
paste boot log of fluentd or td-agent
abbrev (default: 0.1.1)
addressable (2.8.7)
aws-eventstream (1.3.0)
aws-partitions (1.965.0)
aws-sdk-core (3.201.5)
aws-sigv4 (1.9.1)
base64 (0.2.0, default: 0.1.1)
benchmark (default: 0.2.1)
bigdecimal (default: 3.1.3)
bundler (default: 2.4.19, 2.4.17)
cgi (default: 0.3.6)
concurrent-ruby (1.3.4)
cool.io (1.8.1)
csv (3.3.0, default: 3.2.6)
date (default: 3.3.3)
delegate (default: 0.3.0)
did_you_mean (default: 1.6.3)
digest (default: 3.1.1)
domain_name (0.6.20240107)
drb (2.2.1, default: 2.1.1)
english (default: 0.7.2)
erb (default: 4.0.2)
error_highlight (default: 0.5.1)
etc (default: 1.4.2)
excon (0.111.0)
faraday (2.10.1)
faraday-excon (2.1.0)
faraday-net_http (3.1.1)
faraday_middleware-aws-sigv4 (1.0.1)
fcntl (default: 1.0.2)
ffi (1.17.0 x86_64-linux-gnu)
ffi-compiler (1.3.2)
fiddle (default: 1.1.1)
fileutils (default: 1.7.0)
find (default: 0.1.1)
fluent-config-regexp-type (1.0.0)
fluent-plugin-concat (2.5.0)
fluent-plugin-detect-exceptions (0.0.15)
fluent-plugin-grok-parser (2.6.2)
fluent-plugin-json-in-json-2 (1.0.2)
fluent-plugin-kubernetes_metadata_filter (3.5.0)
fluent-plugin-multi-format-parser (1.0.0)
fluent-plugin-opensearch (1.1.4)
fluent-plugin-parser-cri (0.1.1)
fluent-plugin-prometheus (2.1.0)
fluent-plugin-record-modifier (2.1.1)
fluent-plugin-rewrite-tag-filter (2.4.0)
fluent-plugin-systemd (1.0.5)
fluentd (1.17.1)
forwardable (default: 1.3.3)
getoptlong (default: 0.2.0)
http (5.2.0)
http-accept (1.7.0)
http-cookie (1.0.7)
http-form_data (2.3.0)
http_parser.rb (0.8.0)
io-console (default: 0.6.0)
io-nonblock (default: 0.2.0)
io-wait (default: 0.3.0)
ipaddr (default: 1.2.5)
irb (default: 1.6.2)
jmespath (1.6.2)
json (default: 2.6.3)
jsonpath (1.1.5)
kubeclient (4.12.0)
llhttp-ffi (0.5.0)
logger (1.6.0, default: 1.5.3)
lru_redux (1.1.0)
mime-types (3.5.2)
mime-types-data (3.2024.0806)
msgpack (1.7.2)
multi_json (1.15.0)
mutex_m (default: 0.1.2)
net-http (default: 0.4.1)
net-protocol (default: 0.2.1)
netrc (0.11.0)
nkf (default: 0.1.2)
observer (default: 0.1.1)
oj (3.15.1)
open-uri (default: 0.3.0)
open3 (default: 0.1.2)
opensearch-ruby (3.4.0)
openssl (default: 3.1.0)
optparse (default: 0.3.1)
ostruct (default: 0.5.5)
pathname (default: 0.2.1)
pp (default: 0.4.0)
prettyprint (default: 0.1.1)
prometheus-client (4.2.3)
pstore (default: 0.1.2)
psych (default: 5.0.1)
public_suffix (6.0.1)
racc (default: 1.6.2)
rake (13.2.1)
rdoc (default: 6.5.1.1)
readline (default: 0.0.3)
readline-ext (default: 0.1.5)
recursive-open-struct (1.2.2)
reline (default: 0.3.2)
resolv (default: 0.2.2)
resolv-replace (default: 0.1.1)
rest-client (2.1.0)
rexml (3.2.9)
rinda (default: 0.1.1)
ruby2_keywords (default: 0.0.5)
securerandom (default: 0.2.2)
serverengine (2.3.2)
set (default: 1.0.3)
shellwords (default: 0.1.0)
sigdump (0.2.5)
singleton (default: 0.1.1)
stringio (default: 3.0.4)
strptime (0.2.5)
strscan (3.1.0, default: 3.0.5)
syntax_suggest (default: 1.1.0)
syslog (default: 0.1.1)
systemd-journal (1.4.2)
tempfile (default: 0.1.3)
time (default: 0.2.2)
timeout (default: 0.3.1)
tmpdir (default: 0.1.3)
tsort (default: 0.1.1)
tzinfo (2.0.6)
tzinfo-data (1.2024.1)
un (default: 0.2.1)
uri (0.13.0, default: 0.12.2)
weakref (default: 0.1.2)
webrick (1.8.1)
yajl-ruby (1.4.3)
yaml (default: 0.2.1)
zlib (default: 3.0.0)
OpenSearch version -> v 2.11.1