fluent / fluent-plugin-opensearch

OpenSearch Plugin for Fluentd
Apache License 2.0

no address for opensearch (Resolv::ResolvError) #147

Open andreibrebene opened 4 weeks ago

andreibrebene commented 4 weeks ago

Steps to replicate

Fluentd config:

log_level debug
<source>
  @type tail
  path /var/log/containers/*.log
  exclude_path /var/log/containers/fluentd*.log
  tag kubernetes.*
  pos_file /var/log/fluentd.pos
  read_from_head true

  <parse>
    @type cri
    time_format %Y-%m-%dT%H:%M:%S.%L%z
  </parse>
</source>

<filter kubernetes.**>
  @type kubernetes_metadata
</filter>

<filter kubernetes.wazuh>
  @type concat
  key message
  multiline_start_regexp /^\[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3}\]|\berror\b/
  flush_interval 120s
</filter>

<filter kubernetes.icsenrich>
  @type concat
  key message
  multiline_start_regexp /^\[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3}\]|\d{4}-\d{2}-\d{2}/
  flush_interval 120s
</filter>

<filter kubernetes.**>
  @type record_transformer
  remove_keys $['kubernetes']['labels'], $["kubernetes"]["pod_id"], $["kubernetes"]["container_image_id"], $["kubernetes"]["container_image"], $["kubernetes"]["master_url"], $["docker"]["container_id"], $["kubernetes"]["namespace_id"]
</filter>

<match kubernetes.**>
  @type opensearch
  hosts "#{ENV['ELASTICSEARCH_HOST']}"
  scheme https
  user "#{ENV['ELASTICSEARCH_USERNAME']}"
  password "#{ENV['ELASTICSEARCH_PASSWORD']}"
  ssl_verify false
  logstash_format true
  request_timeout 60s
  include_tag_key true
  logstash_prefix "logstash-${tag}"
  logstash_dateformat "%Y.%m.%d"

  <buffer>
    @type file
    path /var/log/fluentd-buffers/kubernetes
    chunk_limit_size 64m
    total_limit_size 2048m  
    flush_interval 10s      
    retry_max_interval 60s
    retry_forever true    
    flush_thread_count 10
    overflow_action block
  </buffer>
</match>

Logs:

fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 fluent/log.rb:383:warn: failed to flush the buffer. retry_times=475 next_retry_time=2024-10-29 09:05:00 +0000 chunk="6259344e524c5ea61ad251179f87a5b4" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>\"192.168.10.10\", :port=>9200, :scheme=>\"https\", :user=>\"admin\", :password=>\"obfuscated\"}): no address for opensearch-node (Resolv::ResolvError)"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1135:in `rescue in send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1097:in `send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:908:in `block in write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `each'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1225:in `try_flush'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [debug]: #0 fluent/log.rb:341:debug: taking back chunk for errors. chunk="6259345bd61c3866dd7c45eaceee824a"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [debug]: #0 fluent/log.rb:341:debug: taking back chunk for errors. chunk="625934532a531331f902d98dd2ccfd50"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 fluent/log.rb:383:warn: failed to flush the buffer. retry_times=475 next_retry_time=2024-10-29 09:04:54 +0000 chunk="6259345bd61c3866dd7c45eaceee824a" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>\"192.168.10.10\", :port=>9200, :scheme=>\"https\", :user=>\"admin\", :password=>\"obfuscated\"}): no address for opensearch-node (Resolv::ResolvError)"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1135:in `rescue in send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1097:in `send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:908:in `block in write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `each'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1225:in `try_flush'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 fluent/log.rb:383:warn: failed to flush the buffer. retry_times=475 next_retry_time=2024-10-29 09:05:02 +0000 chunk="625934532a531331f902d98dd2ccfd50" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>\"192.168.10.10\", :port=>9200, :scheme=>\"https\", :user=>\"admin\", :password=>\"obfuscated\"}): no address for opensearch-node (Resolv::ResolvError)"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1135:in `rescue in send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1097:in `send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:908:in `block in write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `each'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1225:in `try_flush'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [debug]: #0 fluent/log.rb:341:debug: taking back chunk for errors. chunk="625934524b3a23f460de0b179a356bc5"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 fluent/log.rb:383:warn: failed to flush the buffer. retry_times=475 next_retry_time=2024-10-29 09:05:02 +0000 chunk="625934524b3a23f460de0b179a356bc5" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>\"192.168.10.10\", :port=>9200, :scheme=>\"https\", :user=>\"admin\", :password=>\"obfuscated\"}): no address for opensearch-node (Resolv::ResolvError)"

Expected Behavior or What you need to ask

Fluentd works for 10-12 hours and then starts failing with the error above. The opensearch-node host and its address are reachable from inside the fluentd pod, and everything else works. If I restart the fluentd pods, log shipping resumes, but the error comes back after some hours.

Using Fluentd and OpenSearch plugin versions

rwunderer commented 2 weeks ago

I have the same problem with fluentd 1.17.1 and fluent-plugin-opensearch (1.1.5), both fluentd and opensearch running in Kubernetes. For me it happens every 2-3 hours, making the system effectively unusable.

Sample error line:

2024-11-05 11:21:42 +0000 [warn]: #0 failed to flush the buffer. retry_times=0 next_retry_time=2024-11-05 11:21:43 +0000 chunk="626289ac7f7285beba39bb9c87054dee" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>\"opensearch-cluster.monitoring-platform.svc.cluster.local\", :port=>9200, :scheme=>\"https\", :user=>\"fluentduser\", :password=>\"obfuscated\", :path=>\"\"}): no address for opensearch-cluster-warm-2 (Resolv::ResolvError)"

What I find interesting is that, in the error above, :host is the configured output destination (a Kubernetes Service), but the "no address for" part names one of the individual OpenSearch nodes (a data-only node in this example, though it happens with all the nodes from time to time). I have no idea why it would try to resolve an individual node (i.e. a pod).
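
One possible explanation, which nobody in this thread has confirmed: the underlying client can periodically reload its connection list from the cluster, swapping the configured Service address for node addresses that the cluster itself reports, and those names may not resolve from the fluentd pod. A minimal sketch for testing that hypothesis, assuming the plugin's reload_connections option behaves as its README describes (all other options from the original match block are omitted here):

```
<match kubernetes.**>
  @type opensearch
  hosts "#{ENV['ELASTICSEARCH_HOST']}"
  scheme https
  # Assumption, untested in this thread: stop periodically reloading the
  # node list so that only the configured host (the Service) is used.
  reload_connections false
</match>
```

If the error stops appearing with this setting, that would point at connection reloading as the cause; if it does not, the hypothesis is wrong and the setting can be reverted.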

rwunderer commented 2 weeks ago

@andreibrebene not sure if it's too early to celebrate but I seem to have mitigated the problem by setting:

reconnect_on_error true

in the output section.

I still see the above error in the log from time to time, but now it recovers immediately on the next try!
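
For reference, a sketch of where that setting would sit in the match block from the original report. The buffer section and most other options are omitted for brevity; reconnect_on_error is the only addition, and the environment variable names are the ones used in the original config.

```
<match kubernetes.**>
  @type opensearch
  hosts "#{ENV['ELASTICSEARCH_HOST']}"
  scheme https
  user "#{ENV['ELASTICSEARCH_USERNAME']}"
  password "#{ENV['ELASTICSEARCH_PASSWORD']}"
  ssl_verify false
  # Mitigation reported above: reset the connection after an error so
  # that the plugin reconnects on the next send.
  reconnect_on_error true
</match>
```

As reported above, this does not remove the error from the log; it only lets the next retry succeed instead of failing repeatedly.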