fluent / fluentd

Fluentd: Unified Logging Layer (project under CNCF)
https://www.fluentd.org
Apache License 2.0

Unexpected error raised. Stopping the timer. title=:in_tail_close_watcher error_class=NoMethodError error="undefined method `eof?' for nil:NilClass" #3499

Closed (anand3493 closed this issue 2 years ago)

anand3493 commented 2 years ago

Describe the bug

We are using the EFK stack for logging with our AWS EKS cluster. We recently upgraded Fluentd to version 1.13.3 and found this error in the Fluentd pod logs.

[error]: #0 [in_tail_container_logs] Unexpected error raised. Stopping the timer. title=:in_tail_close_watcher error_class=NoMethodError error="undefined method `eof?' for nil:NilClass"

To Reproduce

Logs flow normally at first, but after some time the error above starts to appear repeatedly.

Expected behavior

No error.

Your Environment

- Fluentd version: v1.13.3
- Operating system: Debian GNU/Linux
- Kernel version: 5.4.117-58.216.amzn2.x86_64

Your Configuration

<source>
      @type tail
      @id in_tail_container_logs
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head true
      follow_inodes true
      pos_file_compaction_interval 24h
      read_bytes_limit_per_second 8192
      <parse>
        @type "#{ENV['FLUENT_CONTAINER_TAIL_PARSER_TYPE'] || 'json'}"
        time_format %Y-%m-%dT%H:%M:%S.%NZ
      </parse>
</source>

<match **>
   @type aws-elasticsearch-service
   @id out_aws_es
   @log_level "#{ENV['FLUENT_LOG_LEVEL'] || 'info'}"
   include_tag_key true
   <endpoint>
     url "#{ENV['FLUENT_AWS_ELASTICSEARCH_ENDPOINT']}"
     region "#{ENV['FLUENT_AWS_ELASTICSEARCH_REGION']}"
   </endpoint>
   reload_connections "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS'] || 'false'}"
   reconnect_on_error "#{ENV['FLUENT_ELASTICSEARCH_RECONNECT_ON_ERROR'] || 'true'}"
   reload_on_failure "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_ON_FAILURE'] || 'true'}"
   logstash_prefix "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX'] || 'logstash-apiplatform'}"
   logstash_format "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_FORMAT'] || 'true'}"
   index_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_INDEX_NAME'] || 'logstash-apiplatform'}"
   type_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_TYPE_NAME'] || 'fluentd'}"
   <buffer>
     @type file
     path /fluentd/log/elastic-buffer
     flush_mode interval 
     flush_thread_count "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_THREAD_COUNT'] || '8'}"
     flush_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_INTERVAL'] || '5s'}"
     chunk_limit_size "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_CHUNK_LIMIT_SIZE'] || '16M'}"
     retry_max_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_RETRY_MAX_INTERVAL'] || '30'}"
     retry_timeout 1h
   </buffer>
</match>

Your Error Log

2021-09-08 11:57:05 +0000 [error]: #0 [in_tail_container_logs] Unexpected error raised. Stopping the timer. title=:in_tail_close_watcher error_class=NoMethodError error="undefined method `eof?' for nil:NilClass"
  2021-09-08 11:57:05 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.3/lib/fluent/plugin/in_tail.rb:765:in `eof?'
  2021-09-08 11:57:05 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.3/lib/fluent/plugin/in_tail.rb:547:in `block in detach_watcher_after_rotate_wait'
  2021-09-08 11:57:05 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.3/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
  2021-09-08 11:57:05 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/2.6.0/gems/cool.io-1.7.1/lib/cool.io/loop.rb:88:in `run_once'
  2021-09-08 11:57:05 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/2.6.0/gems/cool.io-1.7.1/lib/cool.io/loop.rb:88:in `run'
  2021-09-08 11:57:05 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.3/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
  2021-09-08 11:57:05 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.3/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2021-09-08 11:57:05 +0000 [error]: #0 [in_tail_container_logs] Timer detached. title=:in_tail_close_watcher

Additional context

No response

anand3493 commented 2 years ago

@ashie Could you please share the impact of this error? Also, in which release can we expect the fix?

anand3493 commented 2 years ago

@ashie Thanks for fixing this issue. Please share the impact and the ETA for the fix.

ashie commented 2 years ago

Thanks for your report! This error is reproduced only when read_bytes_limit_per_second is specified. It occurs after detecting log rotation and waiting for remaining data that might still be written by another application. Log loss or duplication probably won't occur. It will be fixed in v1.14.1 (to be released at the end of September).
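
For readers unfamiliar with this class of error, here is a minimal Ruby sketch of how "undefined method `eof?' for nil:NilClass" can arise and how a nil guard avoids it. This is not the actual fluentd source or the actual fix: the class SketchWatcher, its io_handler attribute, and both method names are hypothetical and exist only for illustration. The assumption is simply that a watcher delegates eof? to a handler object that may not have been set.

```ruby
# Hypothetical, simplified model of a watcher that delegates eof? to a
# handler object. Not the actual fluentd implementation.
class SketchWatcher
  attr_accessor :io_handler

  def initialize(io_handler = nil)
    @io_handler = io_handler
  end

  # Unguarded delegation: raises NoMethodError when the handler is nil.
  def eof_unguarded?
    @io_handler.eof?
  end

  # Guarded delegation: a missing handler is treated as "nothing left to read".
  def eof_guarded?
    @io_handler.nil? || @io_handler.eof?
  end
end

watcher = SketchWatcher.new # no handler attached

begin
  watcher.eof_unguarded?
rescue NoMethodError => e
  puts "unguarded call raised #{e.class}"
end

puts "guarded call returned #{watcher.eof_guarded?}"
```

Whether treating a missing handler as end-of-file is the right behavior depends on the caller; the sketch only shows the shape of the failure seen in the stack trace.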

anand3493 commented 2 years ago

@ashie Do you have any word on the v1.14.1 release date?

kenhys commented 2 years ago

It will be released at the end of this month.

anand3493 commented 2 years ago

@ashie @kenhys I have upgraded Fluentd to v1.14.1. After the upgrade, the new pod hits the error below on startup: [image]

Could you please advise? Thanks in advance.