How about the --emit-error-log-interval SECONDS option?
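For reference, a sketch of how the option would be passed on the command line (the config path here is illustrative):

# emit repeated error logs at most once every 60 seconds
fluentd -c /etc/td-agent/td-agent.conf --emit-error-log-interval 60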
No, it doesn't work. It still emits too many log lines.
2014-12-23 11:25:42 +0700 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-23 11:24:59 +0700 error_class="Errno::ETIMEDOUT" error="Connection timed out - connect(2) for \"SOME_IP_HERE\" port 24224" plugin_id="object:3fe1badc22c0"
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/plugin/out_forward.rb:262:in `initialize'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/plugin/out_forward.rb:262:in `new'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/plugin/out_forward.rb:262:in `connect'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/plugin/out_forward.rb:224:in `send_data'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/plugin/out_forward.rb:144:in `block in write_objects'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/plugin/out_forward.rb:138:in `times'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/plugin/out_forward.rb:138:in `write_objects'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:450:in `write'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/buffer.rb:300:in `write_chunk'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/buffer.rb:280:in `pop'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:311:in `try_flush'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:132:in `run'
2014-12-23 11:25:42 +0700 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-23 11:25:42 +0700 error_class="RuntimeError" error="no nodes are available" plugin_id="object:3fe1badc22c0"
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/plugin/out_forward.rb:156:in `write_objects'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:450:in `write'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/buffer.rb:300:in `write_chunk'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/buffer.rb:280:in `pop'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:311:in `try_flush'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:132:in `run'
2014-12-23 11:25:42 +0700 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-23 11:25:44 +0700 error_class="RuntimeError" error="no nodes are available" plugin_id="object:3fe1b5d8ed58"
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/plugin/out_forward.rb:156:in `write_objects'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:450:in `write'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/buffer.rb:300:in `write_chunk'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/buffer.rb:280:in `pop'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:311:in `try_flush'
2014-12-23 11:25:42 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:132:in `run'
2014-12-23 11:25:43 +0700 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-23 11:26:17 +0700 error_class="RuntimeError" error="no nodes are available" plugin_id="object:3fe1c0068df8"
2014-12-23 11:25:43 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/plugin/out_forward.rb:156:in `write_objects'
2014-12-23 11:25:43 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:450:in `write'
2014-12-23 11:25:43 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/buffer.rb:300:in `write_chunk'
2014-12-23 11:25:43 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/buffer.rb:280:in `pop'
2014-12-23 11:25:43 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:311:in `try_flush'
2014-12-23 11:25:43 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:132:in `run'
2014-12-23 11:25:43 +0700 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-23 11:25:46 +0700 error_class="RuntimeError" error="no nodes are available" plugin_id="object:3fe1b99c3d28"
2014-12-23 11:25:43 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/plugin/out_forward.rb:156:in `write_objects'
2014-12-23 11:25:43 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:450:in `write'
2014-12-23 11:25:43 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/buffer.rb:300:in `write_chunk'
2014-12-23 11:25:43 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/buffer.rb:280:in `pop'
2014-12-23 11:25:43 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:311:in `try_flush'
2014-12-23 11:25:43 +0700 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.57/lib/fluent/output.rb:132:in `run'
Sorry for the delay.
error="no nodes are available" plugin_id="object:3fe1badc22c0" error="no nodes are available" plugin_id="object:3fe1b5d8ed58" error="no nodes are available" plugin_id="object:3fe1c0068df8" error="no nodes are available" plugin_id="object:3fe1b99c3d28"
The different plugin_id in each log entry is weird. I will check it later.
Ah, okay. num_threads prevents stacktrace suppression. I'm now thinking about how to fix the problem.
Related to this issue, a warning was also emitted too many times, which made the hard disk full. The warning is about the buffer queue size exceeding its limit.
2015-01-26 09:25:14 +0700 [warn]: record_reformer: Fluent::BufferQueueLimitError queue size exceeds limit /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.58/lib/fluent/buffer.rb:182:in `block in emit'
There are 11 lines emitted in 1s; you can imagine how quickly that fills an 8 GB hard disk.
It seems your fluentd instance receives lots of requests or sends events to a stuck plugin. How about the following approaches to resolve the BufferQueueLimitError issue?
Use <secondary> to back up events.
That's a good idea, but should one plugin failure make the whole server go down? Currently we are not investing in adding more nodes (at least until we get bigger traffic).
should one plugin failure make the whole server go down?
It should not go down. In this problem, a long-lasting destination error combined with small buffer capacity causes lots of BufferQueueLimitError. Separating the streams relaxes the problem.
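To illustrate "separating the streams" (a sketch only; the tags, hosts, and paths are assumptions): giving each destination its own <match> and buffer means a stuck destination fills only its own queue:

# critical events: dedicated output and file buffer
<match app.critical.**>
  type forward
  buffer_type file
  buffer_path /var/log/td-agent/buffer/critical
  <server>
    host 192.0.2.10
    port 24224
  </server>
</match>

# bulk events: separate output; if this destination is stuck,
# only this buffer hits BufferQueueLimitError
<match app.bulk.**>
  type forward
  buffer_type file
  buffer_path /var/log/td-agent/buffer/bulk
  <server>
    host 192.0.2.11
    port 24224
  </server>
</match>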
How about adding a max log size option? So whatever the error, the log will never exceed a specific size (say, 1 GB, but it should be configurable).
So whatever the error, the log will never exceed a specific size
What does this line mean? Should the input plugin block when buffers are filled?
What does this line mean? Should the input plugin block when buffers are filled?
I think what he means is to set a maximum size the log can grow to. Beyond that size, older events would be removed (either deleted or sent to a backup/archive file) and new events added to keep the log size below the max size setting. Ultimately the size on disk of the log files needs to be controlled, so if sending to a backup/archive, the size and number of those files need to be controlled as well.
Hmm... I see. We are now developing a new plugin API, including the buffer. Maybe this is a chance to add another strategy to the buffer mechanism.
Is this thread about the logs of Fluentd itself? The new buffer plugin API has nothing to do with logging. Does @RIKIL mean log rotation of the Fluentd process itself?
Hmm... from "Beyond that size, older events would be removed (either deleted or sent to a backup/archive file) and new events added to keep the log size below the max size setting.", I assume he means the buffer's queue. And I think dieend's intention is also the buffer. If it means fluentd's own logging, then yes, the buffer plugin is not related.
We have also seen similar issues, as described in the original report: fluentd keeps logging some message repeatedly, using a lot of CPU in the process, and eventually filling up the disk.
I think this is partially an issue with the volume of log messages fluentd generates (which can be mitigated to some extent), but my bigger concern is whether some of these issues are independent of logging, i.e. even if we disabled logging, are there situations where fluentd gets stuck in a tight loop, trying to do something that fails repeatedly due to external factors?
Looks like there are a few things that can mitigate the log impact, at least:
Ah, I see 'emit_error_log_interval' is mentioned in http://docs.fluentd.org/articles/config-file but it is not explained. I tried it, and it does work, except it only suppresses specific messages.
For instance, in my case I did see 'emit transaction failed' only every 10s (as I had configured), but I still see 'syslog failed to emit' every time.
Yeah, we need to improve the logging mechanism for better control.
are there situations where fluentd gets stuck in a tight loop
It should not happen in fluentd's main thread. In a plugin thread, it may happen due to a plugin bug.
log rotation should keep the disk from filling up
We are now using logrotate for fluentd (td-agent) logs. Do you see any concern?
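For reference, a minimal logrotate policy of the kind being discussed (the path, size, and rotation count here are illustrative assumptions, not our actual settings):

/var/log/td-agent/td-agent.log {
    # rotate once the file exceeds 100 MB, keep 7 compressed generations
    size 100M
    rotate 7
    compress
    missingok
    notifempty
    # truncate in place so td-agent keeps writing to the same file handle
    copytruncate
}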
--suppress-repeated-stacktrace helps reduce the volume in some cases (should this be the default)
From v0.12, this is true by default.
Although it appears this is also off by default and not documented outside of the code.
Ah! I will update the documents later.
I still see 'syslog failed to emit' every time.
Hmm... currently, fluentd uses suppression per thread. We will change this to suppression per plugin.
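As a sketch: from v0.12 these knobs can also be set in the <system> section of the config file instead of on the command line (the values here are illustrative):

<system>
  # global log level for the fluentd process
  log_level warn
  # true by default from v0.12
  suppress_repeated_stacktrace true
  # emit repeated error logs at most once per interval (seconds)
  emit_error_log_interval 60
</system>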
I ran into this in 0.12.[16|18] even using --emit-error-log-interval. I've found it pretty easy to recreate:
journalctl -m -o json -f | ncat -w 2s -i 5s --send-only ${fluentd_ip} ${FLUENTD_PORT}
and once the buffers fill up, the logs explode with error msgs, journald pegs a core trying to keep up, and the disk eventually fills up. What also concerned me is that the server kept accepting new messages on the tcp_input.
This all occurs even if --emit-error-log-interval is set.
Short-term fix
# consul uses nagios-like checks, so `exit 1` is a warning and `exit 2` is a failure
( $(curl -s http://127.0.0.1:24220/api/plugins.json | jq -e '[.plugins[].buffer_queue_length] | sort | reverse | .[0] < 5') && echo 'running fine (buffers within limits)') || (echo 'getting too far behind; buffer is full.' && exit 2)
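Note this check assumes the monitor_agent HTTP endpoint is enabled on port 24220, e.g. with a source along these lines (a minimal sketch):

<source>
  # expose plugin metrics, including buffer_queue_length, at /api/plugins.json
  type monitor_agent
  bind 0.0.0.0
  port 24220
</source>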
QUESTION
This surprised me as it seems like a common error condition. Am I missing anything? I can recreate it pretty easily so I suspect not but I would love to be proven wrong.
And how can I have fluentD stop accepting incoming tcp requests if the buffer fills up?
Production Outage
This issue actually took down our entire cluster of fluentD aggregators in a few minutes:
add throttling logic to the script sending information to fluentD from journald
Fluentd's buffer is one of the throttling mechanisms. Would providing a blocking option help with this problem?
And how can I have fluentD stop accepting incoming tcp requests if the buffer fills up?
It needs several modifications. Michael suggests a block option, or having the input plugin handle the buffer-full error. For the latter, back-pressure seems the best way. Blocking the input is not always the best approach, but providing both behaviours is better.
It wasn't on an emergency list as it is only used for retroactive log reporting, so it was flagged as a business-hours fix
For such cases, a small retry_limit together with <secondary> for backup is another way. After the ES cluster has recovered, we can restore the logs from the backed-up files. Some companies use this approach.
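A sketch of that setup (the tag, host, and path are illustrative assumptions): with a small retry_limit, chunks that exhaust their retries are handed off to <secondary> and written to disk for later restore, instead of retrying and logging indefinitely:

<match app.**>
  type forward
  # give up after a few retries instead of retrying for hours
  retry_limit 5
  <server>
    host 192.0.2.10
    port 24224
  </server>
  # chunks that exceed retry_limit are written here as a backup
  <secondary>
    type file
    path /var/log/td-agent/failed-events
  </secondary>
</match>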
This server was logging 2,000+ errors a second
Did you set --emit-error-log-interval but it still generates the errors? What error did you get?
No responses or updates, so let me close this now. Please reopen if the problem still exists.
The log outputs this warning with a stack trace, and it repeats hundreds of thousands of times until the disk is full and the server goes down.
I tried to set log_level to error, but it still happened. I tried to add the --suppress-repeated-stacktrace option; the messages are reduced, but it is still outputting many, many logs, since it only suppresses the stacktrace, not the messages. The only workaround is setting the global log level with -qq, but with this I lose my ability to follow warnings and errors. Is there any way for me to receive as much useful log as possible without it repeating infinitely?
My partial configuration: