repeatedly / fluent-plugin-beats

Fluentd plugin for Elastic beats
Apache License 2.0

unexpected error error="Resource temporarily unavailable" #17

Open n-j-91 opened 5 years ago

n-j-91 commented 5 years ago

I have multiple Filebeat instances sending logs to td-agent. Once I enable fluent-plugin-beats with the following configuration, I see an unexpected error error="Resource temporarily unavailable" in the td-agent log. The ruby process goes to 100% CPU utilization, and a large number of CLOSE_WAIT connections pile up on the fluentd server.

I am using fluent-plugin-beats (1.0.0) on fluentd (1.2.2)

2018-12-20 09:02:37 +0000 [error]: #0 unexpected error error="Resource temporarily unavailable"
  2018-12-20 09:02:37 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-beats-1.0.0/lib/lumberjack/beats/server.rb:343:in `sysread'
  2018-12-20 09:02:37 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-beats-1.0.0/lib/lumberjack/beats/server.rb:343:in `read_socket'
  2018-12-20 09:02:37 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-beats-1.0.0/lib/lumberjack/beats/server.rb:320:in `run'
  2018-12-20 09:02:37 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-beats-1.0.0/lib/fluent/plugin/in_beats.rb:109:in `block in run'
  2018-12-20 09:02:37 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/concurrent-ruby-1.1.4/lib/concurrent/executor/ruby_thread_pool_executor.rb:348:in `run_task'
  2018-12-20 09:02:37 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/concurrent-ruby-1.1.4/lib/concurrent/executor/ruby_thread_pool_executor.rb:337:in `block (3 levels) in create_worker'
  2018-12-20 09:02:37 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/concurrent-ruby-1.1.4/lib/concurrent/executor/ruby_thread_pool_executor.rb:320:in `loop'
  2018-12-20 09:02:37 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/concurrent-ruby-1.1.4/lib/concurrent/executor/ruby_thread_pool_executor.rb:320:in `block (2 levels) in create_worker'
  2018-12-20 09:02:37 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/concurrent-ruby-1.1.4/lib/concurrent/executor/ruby_thread_pool_executor.rb:319:in `catch'
  2018-12-20 09:02:37 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/concurrent-ruby-1.1.4/lib/concurrent/executor/ruby_thread_pool_executor.rb:319:in `block in create_worker'

Here are my td-agent and Filebeat configs:

<source>
  @type beats
  tag beats
  port 55445
  max_connections 350
</source>

filebeat.prospectors:

# Each - is a prospector. Most options can be set at the prospector level, so
# you can use different prospectors for various configurations.
# Below are the prospector specific configurations.

- type: log

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - Z:\IISLogs\*\*
  document_type: iis
  exclude_lines: ['^#', '.*OPTIONS.*']
  fields_under_root: true
  fields:
    severity: info

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["logstore.url:55445"]
  index: "logstash-%{+yyyy.MM.dd}"

n-j-91 commented 5 years ago

I think I found where the issue was. I modified the following in https://github.com/uchann2/fluent-plugin-beats/blob/master/lib/lumberjack/beats/server.rb, and it seems to do the trick.

Added more error handling for:

  Errno::EAGAIN
  Errno::EBADF
  Errno::EACCES
  IO::EAGAINWaitReadable
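
A minimal sketch of how these exception classes relate, not taken from the plugin: try_read is a hypothetical helper, and a pipe stands in for a client socket. It assumes Ruby 2.1 or later on a platform where EAGAIN and EWOULDBLOCK share one value (as on Linux). It shows that a non-blocking read with no data available raises IO::EAGAINWaitReadable, and that rescuing Errno::EAGAIN also catches it, because the former is a subclass of the latter.

```ruby
# Hypothetical helper for illustration; not part of fluent-plugin-beats.
def try_read(io, bytes = 16)
  io.read_nonblock(bytes)
rescue Errno::EAGAIN => e
  # IO::EAGAINWaitReadable is a subclass of Errno::EAGAIN, so it is caught here too.
  "no data yet (#{e.class})"
rescue EOFError, IOError, Errno::ECONNRESET, Errno::EPIPE, Errno::EBADF => e
  "connection finished (#{e.class})"
end

# A pipe stands in for a client socket.
reader, writer = IO.pipe

empty_result = try_read(reader)  # nothing has been written yet
writer.write("ping")
data_result = try_read(reader)   # the bytes written above
writer.close
eof_result = try_read(reader)    # writer is closed, so the read hits EOF

puts IO::EAGAINWaitReadable < Errno::EAGAIN
puts empty_result
puts data_result
puts eof_result
```

Treating EAGAIN as a reason to close the connection is one choice. Waiting with IO.select and retrying the read is another, and which one suits the plugin is not settled by this sketch.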
    rescue EOFError,
      OpenSSL::SSL::SSLError,
      IOError,
      Errno::ECONNRESET,
      Errno::EPIPE,
      Errno::EAGAIN,
      Errno::EBADF,
      Errno::EACCES,
      IO::EAGAINWaitReadable
      # EOF or other read errors, only action is to shutdown which we'll do in
      # 'ensure'
    rescue